Clinical versus Statistical 
Prediction 


A THEORETICAL ANALYSIS AND 
A REVIEW OF THE EVIDENCE 
BY PAUL E. MEEHL 


AMONG scholars and specialists in the 
behavioral sciences there is a perennial 
conflict between the advocates of two 
different methods of predicting human 
behavior — the statistical and the clin- 
ical. The question debated by those in 
the field is, Shall we make use of ex- 
plicit, mathematical manipulations of 
data or shall we rely upon the subjec- 
tive interpretation of our data through 
skilled judgment? 

Professor Meehl considers in detail 
the profound philosophical and math- 
ematical issues raised by the conflict 
and presents an extensive review of 
the controversial literature. He sum-
marizes the available empirical studies
which have compared the two pre-




University of Minnesota Press, Minneapolis




Copyright 1954 by the 


UNIVERSITY OF MINNESOTA


All rights reserved. No part of this book may be repro- 
duced in any form without the written permission of the 
publisher. Permission is hereby granted to reviewers to 
quote brief passages in a review to be printed in a 
magazine or newspaper. 


PRINTED AT THE COLWELL PRESS, MINNEAPOLIS


Library of Congress Catalog Card Number: 54-1177} 


PUBLISHED IN GREAT BRITAIN, INDIA, AND PAKISTAN BY 
GEOFFREY CUMBERLEGE: OXFORD UNIVERSITY PRESS, LONDON, BOMBAY, AND KARACHI 


Preface 


This monograph is an expansion of lectures given in the years
1947-1950 to graduate colloquia at the universities of Chicago, 
Iowa, and Wisconsin, and of a lecture series delivered to staff and 
trainees at the Veterans Administration Mental Hygiene Clinic 
at Ft. Snelling, Minnesota. I am indebted to the staff and gradu- 
ate students who attended these lectures for criticisms and sug- 
gestions which have contributed materially to the present form 
of the argument. Conversations and correspondence with Drs. 
E. S. Bordin, Robert C. Challman, Lee J. Cronbach, Herbert 
Feigl, James J. Jenkins, E. J. Shoben, Donald E. Super, and 
Joseph Zubin have also been very illuminating. Although I am 
compelled to disagree with some of his theoretical formulations of 
clinical method, the basic approach and clinical philosophy of my 
teacher and colleague Dr. Starke R. Hathaway are inextricably 
involved in most of what follows. I wish to thank Dr. Morris S. 
Viteles and Dr. Robert Y. Walker for their kindness in making 
available to me their personal copies of the out-of-print Dunlap 
and Wantman study (reference 38). Dr. Richard Melton read 
the manuscript while working on his own thesis (73) and called
my attention to two additional studies by Borden (15) and Ham- 
lin (48). Dr. Albert Rosen located the Blenkner (12) paper. I am 
indebted to Dr. Charles Bird, Dr. William Schofield, and Dr. 
Charles Halbower for valuable editorial criticisms. 

The manuscript of this book has been in substantially its pres-
ent form since 1950, and I have not modified it as a result of non-
empirical writings published since then. Because of the special
role of T. R. Sarbin’s contributions on the topic under considera-
tion, I especially urge the reader to consult Sarbin and Taft’s
An Essay on Inference in the Psychological Sciences (88), which 
treats, in much greater detail and with citation of empirical 
studies, some of the matters I raise speculatively in Chapter 7. 
But since, as I understand it, Sarbin’s view on the main question 
is still fundamentally the same as that which he expressed in 
earlier publications, I have not attempted to incorporate the 
Sarbin-Taft monograph into my discussion. 

Perhaps a general remark in clarification of my own position is 
in order. Students in my class in clinical psychology have often 
reacted to the lectures on this topic as to a projective technique, 
complaining that I was biased either for or against statistics (or 
the clinician), depending mainly on where the student himself 
stood! This I have, of course, found very reassuring. One clinical 
student suggested that I tally the pro-con ratio for the list of 
honorific and derogatory adjectives in Chapter 1 (page 4), and 
the reader will discover that this unedited sample of my verbal 
behavior puts my bias squarely at the midline. The style and 
sequence of the paper reflect my own ambivalence and real puz- 
zlement, and I have deliberately left the document in this dis- 
cursive form to retain the flavor of the mental conflict that besets 
most of us who do clinical work but try to be scientists. I have 
read and heard too many rapid-fire, once-over-lightly “resolu-
tions” of this controversy to aim at contributing another such.
The thing is just not that simple. I was therefore not surprised 
to discover that the same sections which one reader finds ob- 
vious and overelaborated, another singles out as especially use- 
ful for his particular difficulties. My thesis in a nutshell: 

“There is no convincing reason to assume that explicitly for- 
malized mathematical rules and the clinician’s creativity are 
equally suited for any given kind of task, or that their compara- 
tive effectiveness is the same for different tasks. Current clinical 
practice should be much more critically examined with this in 
mind than it has been.” 



It is my personal hunch, not proved by the presented data or 
strongly argued in the text, that a very considerable fraction of 
clinical time is being irrationally expended in the attempt to do, 
by dynamic formulations and staff conferences, selective and 
prognostic jobs that could be done more efficiently, in a small 
fraction of the clinical time, and by less skilled and lower paid 
personnel through the systematic and persistent cultivation of 
complex (but still clerical) statistical methods. This would free 
the skilled clinician for therapy and research, for both of which 
skilled time is so sorely needed. 

Since I am myself a hybrid working clinician and rat psycholo- 
gist, I feel that I am in a favorable position to see somewhat ob- 
jectively, and I do not honestly think I am on either side of this 
debate. But I hope the reader will agree with me that fair- 
mindedness cannot mean a mushy, middle-of-the-road position 
(“everyone is right!”) on each of the issues when separately con- 
sidered. When the major components of this long-standing con- 
troversy are teased apart by methodological analysis, I believe 
one can say some fairly definite things about them individually. 
When such definite positions are taken in defiance of clichés, 
toes are stepped on. Perhaps the most I can hope for is that I 
have stepped on clinical and statistical toes without favoritism. 
I hope that this scattering of my shots will incidentally help to 
disabuse non-Minnesota clinicians of the F- perception that
there is a clear, monolithic “Minnesota line,” predictable on the 
basis of conventional categories (nomothetic, dynamic, behavior- 
istic, dust-bowl empiricist, global, objectivist, analytically orient- 
ed, and the like). Those who think about clinical issues in such 
terms cannot hope to understand the complexities of the reality. 

Thanks are especially due to my wife Alyce, who knows how to 
protect a man at his work; and to Russell H. Linton, for many 
hours of informal psychotherapy. 

PAUL E. MEEHL
University of Minnesota 
June 11, 1954 




Table of Contents


Chapter 1
THE PROBLEM


Chapter 2
SOME PRELIMINARY DISTINCTIONS


Chapter 3
THE RATIONALITY OF INFERENCE FROM
CLASS MEMBERSHIP


Chapter 4
THE SPECIAL POWERS OF THE CLINICIAN


Chapter 5
THE THEORETICAL ARGUMENT OF T. R. SARBIN


Chapter 6
THE PROBLEM OF THE LOGICAL RECONSTRUCTION
OF CLINICAL ACTIVITY


Chapter 7
REMARKS ON CLINICAL INTUITION


Chapter 8
EMPIRICAL COMPARISONS OF CLINICAL AND
ACTUARIAL PREDICTION


Chapter 9
GENERAL REMARKS ON QUANTIFICATION OF
CLINICAL MATERIAL


Chapter 10
A FINAL WORD: UNAVOIDABILITY
OF STATISTICS


REFERENCES
INDEX




The Problem


One of the major methodological problems of clinical psychol-
ogy concerns the relation between the “clinical” and “statistical”
(or “actuarial”) methods of prediction. Without prejudging the 
question as to whether these methods are fundamentally different, 
we can at least set forth the main difference between them as it 
appears superficially. The problem is to predict how a person is 
going to behave. In what manner should we go about this pre- 
diction? 

We may order the individual to a class or set of classes on the 
basis of objective facts concerning his life history, his scores on 
psychometric tests, behavior ratings or check lists, or subjective 
judgments gained from interviews. The combination of all these
data enables us to classify the subject; and once having made 
such a classification, we enter a statistical or actuarial table which 
gives the statistical frequencies of behaviors of various sorts for 
persons belonging to the class. The mechanical combining of in- 
formation for classification purposes, and the resultant proba- 
bility figure which is an empirically determined relative frequen- 
cy, are the characteristics that define the actuarial or statistical 
type of prediction. 
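
The actuarial procedure just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a procedure given in the text: the classification rule `classify`, its cut-offs, and the six past cases are all hypothetical, invented solely to show the mechanics of assigning a person to a class by a fixed rule and then reading empirically determined relative frequencies from a table.

```python
from collections import Counter, defaultdict


def classify(person):
    """Hypothetical mechanical rule: assign a person to a class from
    objective data alone. The variables and cut-offs are illustrative."""
    age_band = "under_30" if person["age"] < 30 else "30_plus"
    score_band = "high" if person["test_score"] >= 50 else "low"
    return (age_band, score_band)


def build_actuarial_table(records):
    """Turn (class, outcome) pairs from past cases into a table of
    relative frequencies of each outcome within each class."""
    counts = defaultdict(Counter)
    for class_key, outcome in records:
        counts[class_key][outcome] += 1
    table = {}
    for class_key, outcome_counts in counts.items():
        total = sum(outcome_counts.values())
        table[class_key] = {o: n / total for o, n in outcome_counts.items()}
    return table


def predict(table, person):
    """Enter the table with the person's class; no judgment is involved."""
    return table[classify(person)]


# Invented past cases: (objective data, observed outcome).
past_cases = [
    ({"age": 25, "test_score": 62}, "improved"),
    ({"age": 27, "test_score": 55}, "improved"),
    ({"age": 22, "test_score": 71}, "improved"),
    ({"age": 29, "test_score": 58}, "unimproved"),
    ({"age": 41, "test_score": 35}, "unimproved"),
    ({"age": 38, "test_score": 44}, "improved"),
]

table = build_actuarial_table((classify(p), outcome) for p, outcome in past_cases)
new_person = {"age": 24, "test_score": 66}
print(predict(table, new_person))  # {'improved': 0.75, 'unimproved': 0.25}
```

For these made-up cases the new person falls in a class of four past cases, three of whom improved, so the table yields 0.75. Once the class is fixed, the figure follows without any further interpretation of the individual.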

Alternatively, we may proceed on what seems, at least, to be a 
very different path. On the basis of interview impressions, other 
data from the history, and possibly also psychometric informa-
tion of the same type as in the first sort of prediction, we formu-
late, as in a psychiatric staff conference, some psychological
hypothesis regarding the structure and the dynamics of this par- 
ticular individual. On the basis of this hypothesis and certain 
reasonable expectations as to the course of outer events, we ar- 
rive at a prediction of what is going to happen. This type of pro- 
cedure has been loosely called the clinical or case-study method 
of prediction. 

Although all clinical psychologists make use of both sorts of 
predictions in varying degrees, and everyone admits some special 
merits and demerits of each type, it is nevertheless possible to 
characterize many clinicians as favoring one or the other. On this 
attitudinal continuum we would put such writers as Sarbin (85, 
86, 87) at the one extreme together with Lundberg (70) and 
many users of “traditional” personality inventories. One usually 
thinks of Allport (4, 5), Murray (75), the psychoanalytic group 
(e.g., 2), psychiatrists generally, and most of the workers with a 
strong interest in projective techniques as being at the other end. 

It is customary to apply honorific adjectives to the method pre-
ferred, and to refer pejoratively to the other method. For instance,
the statistical method is often called operational, communicable, 
verifiable, public, objective, reliable, behavioral, testable, rigor- 
ous, scientific, precise, careful, trustworthy, experimental, quanti- 
tative, down-to-earth, hardheaded, empirical, mathematical, and 
sound. Those who dislike the method consider it mechanical, 
atomistic, additive, cut and dried, artificial, unreal, arbitrary, in- 
complete, dead, pedantic, fractionated, trivial, forced, static, 
superficial, rigid, sterile, academic, oversimplified, pseudoscien-
tific, and blind. The clinical method, on the other hand, is labeled
by its proponents as dynamic, global, meaningful, holistic, subtle,
sympathetic, configural, patterned, organized, rich, deep, genuine,
sensitive, sophisticated, real, living, concrete, natural, true to life,
and understanding. The critics of the clinical method are likely
to view it as mystical, transcendent, metaphysical, super-mun-
dane, vague, hazy, subjective, unscientific, unreliable, crude,
private, unverifiable, qualitative, primitive, prescientific, sloppy,
uncontrolled, careless, verbalistic, intuitive, and muddleheaded.
There are also some words (e.g., positivistic, behavioristic) which 
are used sometimes favorably, sometimes unfavorably, depending 
upon the views of the speaker. Because of the extensive use of 
polemical words in discussions of the problem, I have listed them 
at the beginning for cathartic purposes so that we may proceed 
to our analysis unencumbered by the need to say them. 

As a reminder of the flavor of this controversy, let us consider 
a few quotations without any attempt at a critical analysis of the 
kind of argument which is offered: 


. . . the global approach at least respects the complexity of per-
sonality problems and seeks some elementary understanding be-
fore bursting into figures. (50, p. 50.)


Such standardization by its very nature ignores the individual. 
. . . All our theories of personality are at variance with the no-
tion that the summation of a series of items determined by dis-
crete frequency tables could ever be expected to give an accurate
dynamic picture of an individual. (74, p. 288.) 


Moreover, it has been claimed that psychostatistical manipula- 
tions and rigidly objective procedures are less applicable when 
carried over from the investigation of cognitive functions . . . to
the more affective aspects of total personality. (89, p. 278.) 


It would naturally be absurd ever to expect standardized tables 
based on statistical research which would enable one to deter- 
mine whether a subject is schizophrenic, neurotic, or any other 
definite personality type—normal or abnormal. . . . There is no 
possibility of a rigid schematization, such as the establishment of 
standardized tables in which the scoring and interpretive value 
of every single Rorschach response would be listed. . . . Such a 
schematization would be incompatible with the basic principles 
of . . . any true personality diagnosis. (65, p. 21.) 


In the latter [nonprojective] tests, the results of every individual
examination can be interpreted only in terms of direct, descrip- 
tive, statistical data and, therefore, never can attain accuracy 
when applied to individuals. Statistics is a descriptive study of 
groups, and not of individuals. (79, p. 638.) 


The statistical point of view must be supplemented by the clini- 
cal point of view. (101, p. 134.) 




. . . present statistical methods deal with averages and proba- 
bilities and not with specific dynamic combinations of factors. 
(20, p. 38.)


A mathematical formula is possible and Zubin has attempted one; 
but it is in that rarefied mathematical atmosphere that has mean- 
ing only to mathematicians and statisticians. The present writer 
admired Zubin’s effort, but found himself returning to inspection. 
(9, p. 85.) 

Indeed, psychological causation is always personal and never ac- 
tuarial. . . . This is not to deny that actuarial prediction has its 
place (in dealing with masses of cases); it is good so far as it goes,
but idiographic prediction goes further. (5, p. 156.) 


If predictions based on frequency were all that were possible, then
a Hollerith machine worked on the basis of known frequencies by
a robot could predict future behavior as well as a sensitive judge.
(5, p. 159.)

Many other quotations of this sort could be given, although they 
are more frequent and uninhibited in informal discussions of 
clinical work than in journal articles. 

I became fascinated by this problem at the 1947 meeting of the 
American Psychological Association, where Dr. E. Lowell Kelly 
presided at a symposium on clinical and statistical methods, a
joint meeting of the clinical section of the A.P.A. and the Psycho- 
metric Society. Two comments could be made about this session. 
First, it was not very long before the usual arguments developed 
between the “clinicians” on the one side of the room and the 
“statisticians” on the other side. Dr. David Rapaport, for in- 
stance, said that certain statisticians apparently wanted him to 
substitute a Hollerith machine for his eyes and his brain. A second 
comment might be that the meeting was relatively poorly attend- 
ed; which, considering the fundamental importance of the prob- 
lem, seems to me a bad sign. 

For this issue is not a trivial or academic one. In the first place, 
a psychologist’s orientation on the matter has a considerable im-
pact on his clinical practice. The degree and kind of validation he
requires for a clinical instrument before using it to decide matters
of commitments, shock, lobotomy, and psychotherapy, depend
upon his conception of validation and his notion of what the
phrase “clinical validation” can reasonably mean. It is quite clear
that a large number, perhaps the majority, of heated arguments 
about projective methods turn very shortly into a clinical-statisti- 
cal controversy. And quite apart from a choice of testing instru- 
ments in the light of their validities, the distribution of clinical 
time is involved. How many hours of time of skilled psychologi- 
cal personnel can be profitably spent in staff conferences or team 
meetings in the attempt to make clinical judgments about the 
therapeutic potential of cases? This problem arises because thera-
pists are in shortage. Every hour spent in thinking and talking
about whom to treat, and how, and how long, is being subtracted 
from the available pool of therapeutic time itself. The clerk or the 
statistician cannot do therapy; hence it is of the greatest im- 
portance to ascertain whether the clinician can do a better job of 
prediction than they can. If he cannot, we are wasting his precious 
time. 

Furthermore, there are in every clinical setting occasions on 
which the predictions which would be made from a straight 
actuarial approach do not agree with the predictions made by a 
clinician. If some class to which the patient objectively belongs 
suggests a certain type of outcome on the basis of previous statis- 
tical experience, whereas the staff member who has been working 
with the patient feels that he understands the problem in terms 
of the individual dynamics of the case, it is necessary to decide 
whether practical decisions should be based on the actuarial find-
ings or on the insight of the individual clinician.

The professional relationships of the psychologist are also pro-
foundly influenced by his position on this issue. The use of psy-
chometric devices, a statistical orientation, and the possession of
statistical skills constitute unique tools of the psychologist. In 
the matter of history-taking and, if properly trained, counseling 
and psychotherapy, the psychologist, psychiatrist, and social 
worker are all capable; yet each of the three disciplines has its
own unique kind of contribution. The professional prestige of the
clinical psychologist and the kind of professional satisfaction he
gets from his work will be influenced profoundly by his orienta-
tion with respect to clinical and statistical methods.

In addition, it is desirable to have some rational formulation of 
what we do in practice. Two such apparently different methods 
of prediction should be somehow understood in their logical rela- 
tionship to one another. Which differences between them are 
basic, and which are merely apparent? Why does one method of
prediction “work” better in one case, the other in another? In the 
interests of intellectual consistency some rational reconstruction 
of the relationship of the two techniques needs to be given. 

Finally, a clinician’s view on this matter has a considerable im- 
pact on the character of his research. What sorts of things the 
psychologist decides to study, what methods he employs in study-
ing them, and (unfortunately) the kind of results he finds depend
partly upon his position on this clinical-actuarial continuum. 

Some of the questions which are often involved in the clinical- 
statistical discussion may be stated: Which of the two methods 
works better? How much mathematics and statistics should be 
required in the training of clinical psychologists? What should be 
done in individual cases when the actuarial and clinical predic- 
tions are not in agreement? Can it be argued that the statistical 
approach is suited for research but the clinical or case-study ap-
proach is the only one suited for clinical practice? Since clinically
we are concerned with individuals and not group trends, should
we therefore be paying less attention to the results of statistical
methods when we work in the clinic? Do statistical methods im-
ply an ignoring of “dynamic” factors? Can statistical methods be
applied to all phases of projective techniques? If not, what limita-
tions are there? What is to be substituted for them? Are there
kinds of questions which it is simply absurd to try to formulate
statistically? Is there a kind of clinical validation which brings
its own credentials and is freed of the traditional problems of
validity? Doesn’t a global approach make statistical procedures
outmoded? What relation exists between the statistical-clinical
and nomothetic-idiographic dichotomies? Are these dichotomies
or actually continua? What about the statistics of the single case?
(See 7, 28, 29.) How about taking the person as your population
from which samples are taken? Aren't statistical methods appro- 
priate only to inventories of the old type? Could not all clinical 
inferences be, in theory, made in a formal, statistical fashion? As 
science advances, can’t we expect to see the gradual replacement 
of the clinician’s judgment and synthesis by automatic, cut-and- 
dried manipulation of data? 

Some of these questions are either pseudo-problems or involve
large components of pseudo-issues, and others have either mathe- 
matical or empirical answers. All of them have need of semantic 
clarification. 


Some Preliminary Distinctions 


Discussions of the problem tend to lump together issues which
are logically independent, simply because of certain sociological 
clusterings in the opinions of psychological practitioners. Thus, if 
your remarks show that you are favorable to a fairly orthodox
brand of Freudian theory, others are likely to assume that you 
are global, intuitive, antibehaviorist, projectivist, and nonstatis- 
tical. There is no doubt that certain clusters actually exist in the 
behavior of psychologists, such as described in Murray’s list of 
differences between centralists and peripheralists (75, pp. 6-10).
If I know that a psychologist is a Hullian in learning theory, that
he has done experiments on albino rats, and that he owns a copy 
of Skinner, I can predict somewhat better than chance that he 
will be mildly suspicious of the Rorschach, that he would put his 
bets on actuarial methods of prediction, and that he thinks that 
candidates for the doctorate in clinical psychology ought to learn 
a little undergraduate mathematics. Nonetheless, there is no
logical implication from one of these opinions to the others. If
you bet this way you stand to win, but in attempting a rational
analysis of the issues involved we must not take these sociologi-
cal groupings for granted as a basis for argument.

It should be emphasized that I am concerned in this mono- 
graph wholly with the problem of prediction and am not talking 
about psychotherapy. It is evident that one cannot manipulate 
the behavior of a person by filling numbers into a multiple re-
gression equation. Of course, certain aspects of prediction (e.g.,
prognosis with insulin shock treatment in schizophrenics, or
choice of interviewing technique) have a direct therapeutic im-
port. The application of concrete predictions to therapeutic prob-
lems is a practical issue and will not be treated except tangential-
ly. I am concerned here solely with the empirical problem of
making correct predictions about the course of events, and with 
a logical analysis of this enterprise. 

The first clarifying possibility that occurs to me is that there 
may be two different kinds of statistics or, perhaps I should bet- 
ter say, two different ways of applying statistics. I do not have 
any great confidence in this distinction but find it helpful in 
thinking about this issue. There are no standard words for these 
two methods, and I should propose the distinction between what 
may be called the discriminative (or validating) use of statistics 
on the one hand, and the structural (or analytic) use of statistics 
on the other. As a first approximation, we may say that the dis- 
criminative or validating use of statistics is the use which makes 
few or no psychological assumptions about the nature or struc- 
ture of the behavior being investigated. The use of such methods 
is almost wholly neutral as regards theory. The only assumptions 
made are certain very basic or broad assumptions, usually direct- 
ly confirmable within the data, involving such things as the shape 
of the population frequency function and the randomness of cer- 
tain series. In the pure case of this use of statistics, the only as- 
sumptions required are those of the theory of probability. Even 
here, the empirical conditions for applicability—e.g., the exist- 
ence or nonexistence of randomness— can usually be subjected 
to a direct empirical test within the material collected. 

Typical questions of the discriminative or validating type
would be as follows: “Is the trait or attribute x associated in any
way (not merely in the sense of Pearson r) with the attribute
y in a group of persons defined by so-and-so?” “When Mr. A
mentions his brother in an interview, is he more likely to talk
about his thwarted ambitions than he is in those interviews in
which he does not make any mention of his brother?” “Can a
group of educated judges match these personality sketches bet-
ter than chance with the names of people they know?” “If I,
clinician Z, make any use or combination I choose of the MMPI 
profiles of patients called schizophrenic at this hospital, in the 
attempt to predict the rated outcomes of insulin shock therapy, 
can I do so significantly better than I could by flipping pennies 
or entering a table of random numbers?” The prototype of this 
kind of statistics, it seems to me, would be the method of cor- 
rect matchings. We do not, except in designing the experiment, 
make any implicit assumptions concerning the judges, the kind 
of data they are using, the mode of combining information, etc. 
The use of statistics consists in a direct application of pure com- 
binatorial analysis in which the reference base is the “chance” 
hypothesis; and the probabilities by, for example, Chapman’s 
tables (30) are precise upon this basis.
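
The combinatorial reasoning behind the method of correct matchings can be made concrete with a short Python sketch. It rests on assumptions that go beyond the text: a judge matches n sketches one-to-one with n names, and under the “chance” hypothesis every such matching is equally likely. The function names are hypothetical, and the code does not reproduce Chapman’s tables; it only computes exact chance probabilities for this simplest design.

```python
from fractions import Fraction
from math import comb, factorial


def derangements(m):
    """Number of ways to arrange m items so that none is in its own place."""
    if m == 0:
        return 1
    if m == 1:
        return 0
    prev2, prev1 = 1, 0  # D(0), D(1)
    for i in range(2, m + 1):
        # Standard recurrence: D(i) = (i - 1) * (D(i - 1) + D(i - 2))
        prev2, prev1 = prev1, (i - 1) * (prev1 + prev2)
    return prev1


def prob_exactly(n, k):
    """Chance probability of exactly k correct matches out of n:
    choose which k are correct, then derange the remaining n - k."""
    return Fraction(comb(n, k) * derangements(n - k), factorial(n))


def prob_at_least(n, k):
    """Chance probability of k or more correct matches out of n."""
    return sum(prob_exactly(n, j) for j in range(k, n + 1))


# A judge matching 5 sketches to 5 names gets 3 or more right by
# chance alone with probability 11/120 (about 0.09).
print(prob_at_least(5, 3))  # 11/120
```

Nothing in this calculation refers to how the judge reached his matchings, which is why such a use of statistics is neutral as regards psychological theory.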

As distinguished from the discriminative or validating use of 
statistics, I have proposed the term structural (or analytic) use. 
This use of statistics presupposes certain empirical assumptions 
about the behavior—or constructs used to “explain” the be- 
havior (71) —which are not themselves directly confirmed in the 
analysis. Tf these assumptions are false, or to the extent that they 
are poor approximations, the inferences are untrustworthy. Often 
the complete statement of the required hypothesis concerning the 
behavior or constructs may be of a high order of complexity. As 
examples of such a use of statistical method, I would consider 
such inferences as these: “I have solved the multiple factor equa-
tions backward from this individual’s test scores, and I conclude
that he has an amount a + e of Factor II as a primary ability.” 
“The orthogonal solution of the intercorrelation matrix of these 
symptoms indicates the presence of a psychological dimension 
hysteria-dysthymia which is similar to the extravert-introvert 
continuum and which is uncorrelated with a trait of general neu- 
roticism.” “The analysis of covariance indicates that the observed 
differences in trait A among different social strata are attributable
solely to differences in verbal intelligence.” The prototype of this
use of statistics is factor analysis. 



I do not know whether it is possible to assign most statistical 
tools or techniques to these two classes without regard to the par- 
ticular use to which one puts them. Even in the case of factor 
analysis, if one is willing to look upon the factor matrix as “noth- 
ing but” an arbitrary simplification of an intercorrelation matrix, 
no psychological issues are involved. It is difficult to see what is 
the value of such an approach, for either theoretical or practical 
purposes. If we are interested in a straight prediction problem, 
as Burt has pointed out (21) factor analysis cannot enable us to 
improve upon straight regression procedures, where the sampling 
problem has been better worked out. If it is the intention to use 
results of the analysis for the improvement of testing instruments 
so that they will have greater inherent validity and “purity,” the 
likelihood of achieving this depends upon the adequacy of our 
psychological inferences made as a result of the factor analysis.

If, for example, a particular solution of the rotation problem 
gives us three factors which do not correspond at all to the under- 
lying dynamics (causal agents) which have in fact given rise to 
the observed correlations, we shall not find any particular im- 
provement in prediction when we make up new test items on the 
basis of the pseudo-insight gained from an inspection of the old 
factor matrix. Even such statistically simple procedures as par- 
tial correlations or the discriminant function are discriminative 
or structural depending upon what we do with them. Sometimes
we use the discriminant function simply to give the optimal
weight to the members of a predictive battery, and the assump- 
tion of linearity and absence of pattern interactions of the pre- 
dictive variables are assumptions which are testable within the 
data. On the other hand, we may be interested in making a 
psychological interpretation of the weight in the discriminant
function and in speaking about the contribution (in the causal-
determinative sense) of the dimensions measured. Such state-
ments are sometimes made in such a manner that they must be
considered structural-analytic application of this neutral statisti-
cal tool. If the purely statistical assumptions are fulfilled, a par-
tial correlation simply tells us what the correlation surface of two
variables is like on a slice of the box determined by looking only
at the triads of numbers in which the third number has a constant 
value. But in actual research it is very rare that we are willing 
to confine ourselves to such a cautious claim. We want to know, 
for example, whether the relationship between achievement test 
score and socioeconomic status is attributable to the factor of in- 
telligence. The well-known problems involving whether one par- 
tials out too much, what direction the relationship runs, and so 
on arise because the statistical analysis of the data does not make 
these structural-analytic distinctions. 
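
The purely statistical content of a first-order partial correlation can be made concrete with a short sketch. The formula is the standard one for the correlation of x and y with z held constant; the three input correlations are hypothetical values chosen only for illustration and are not taken from any study cited in the text.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical zero-order correlations (assumed for illustration only).
r_achievement_ses = 0.40   # achievement test score with socioeconomic status
r_achievement_iq = 0.60    # achievement test score with intelligence
r_ses_iq = 0.50            # socioeconomic status with intelligence

r_partial = partial_correlation(r_achievement_ses, r_achievement_iq, r_ses_iq)
print(round(r_partial, 3))
```

With these assumed inputs the partial correlation falls to roughly .14. The arithmetic says nothing about whether intelligence is the cause of the original relationship, or whether too much has been partialed out; that reading is the structural-analytic step the statistics cannot supply.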

We may tentatively conclude, then, that this distinction refers 
both to the aim of a statistical procedure and (as a consequence 
of the aim) the assumptions of a nonstatistical character which 
must be made in order for this aim to be reached on the basis of 
the statistical findings. The method of correct matchings, simple 
significance tests, and straight prediction systems will usually be 
found to be discriminative-validating; whereas factor analysis, 
the analysis of covariance, and most applications of partial cor- 
relations will typically be used in a structural-analytic way. 

As I have said, I am not sure of the value of this distinction 
and I am not arguing that it reflects a fundamental logical dif- 
ference between the two kinds. But it seems to me that discus- 
sions regarding the use of statistical methods in clinical work are 
sometimes confused because arguments for or against one of these 
uses of statistics are erroneously treated or reacted to as argu- 
ments for or against the other. For example, in response to a de- 
mand for validation data, clinicians will sometimes state that 
they “do not work in a mechanical, additive way” and that the 
usual statistical procedures are therefore not applicable to their 
clinical behavior. More often than not, this is hokum. Again, some 
clinicians object to factor analysis because it uses basic equations 
with no cross-product terms, and because the assumption of con- 
stant factor loadings over the population is implausible. Here the 
clinician is (I think perhaps validly) calling into question a
psychological presupposition needed for a particular
structural-analytic use. But this does not in the least free him
from the obligation of showing statistically that his own
predictions, on different assumptions, tend to be correct. That is,
it does not enable him to avoid the discriminative-validating use
of statistics. Unless these functions are separated, confusion
results continually.

A second distinction is that between the source or type of
information employed in making predictions, and the manner in
which this information is combined for predictive purposes. It
appears to me that Allport has contributed somewhat to this
confusion. I should distinguish first, as regards data, between
psychometric and nonpsychometric kinds of information. As a
completely different dichotomy (or continuum) I should
distinguish as regards method between mechanical (or formal)
methods of combining data and nonmechanical (or informal)
methods (so-called judgmental, clinical, impressionistic, or
subjective).

With reference to the kind of data used, by psychometric I
mean tests in the fairly strict use of that term. If the data arise
from a systematic behavior sample having the following four
cardinal properties of a psychological test, I shall consider them
psychometric: (1) standardized conditions of administration, (2)
immediate recording of the behavior or behavior products, (3)
objective classification of the responses (“scoring”), (4) norms.
It seems that this division between psychometric and
nonpsychometric samples of behavior is also actually a continuum
rather than a dichotomy. Any kind of information which is not based
upon tests in the above-defined sense I shall call nonpsychometric
or case-study data. Examples of this would be remarks
made during an interview, the social history, a police record, a
rating by the examining physician, facts about present marital
or employment status, subjective impressions from the patient’s
voice, expressive movements, etc. Note that case-study data need
not be “subjective” or “impressionistic,” although they may be.

As for the combining method, by mechanical (or statistical) I
mean that the prediction is arrived at by some straightforward
application of an equation or table to the data. I do not mean
the word in its usual pejorative sense. This table, let me
emphasize, does not have to be a table of individuals. The elements of
such a table may be episodes or occasions in the life history of 
one person. The defining property is that no judging or inferring 
or weighing is done by a skilled clinician. Once the data have been 
gathered from whatever source and of whatever type, the pre- 
diction itself could be turned over to a clerical worker. By non- 
mechanical or informal methods of combining I mean those of 
any other sort. It must be stressed that “nonmechanical” is not 
to be identified with “intuitive” or with any mode of combining 
data that has the connotation of subjectivism or irrationality. 
It may be intuitive in special cases; on the other hand the clini- 
cian making this sort of prediction may give explicit reasons for 
his predictions from the data but they are not a mechanical con- 
sequence of a table or equation plus rules for applying it. That 
Sherlock Holmes does not employ an actuarial table is not
tantamount to saying that his procedures are nonrational! (A minor
point here is that the clinician may talk about a score which in
itself is actuarial in the sense that it is, say, a sigma score. But
unless there is some direct and strict relation between this score
and a prediction that is tabulable, he is not predicting
mechanically in the way I am using the term.)
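
The defining property of a mechanical method, that the prediction could be turned over to a clerical worker once the data are gathered, can be sketched in a few lines. Everything specific below is a hypothetical assumption made for illustration: the regression weights, the variable names, and the entries of the expectancy table are invented and come from no study.

```python
from typing import Dict, Tuple

# Hypothetical regression weights (assumed, not estimated from any data).
INTERCEPT = 0.50
WEIGHT_INTELLIGENCE = 0.02
WEIGHT_READING_SPEED = 0.01


def predict_by_equation(intelligence_score: float, reading_speed_score: float) -> float:
    """Mechanical prediction by straightforward application of an equation."""
    return (INTERCEPT
            + WEIGHT_INTELLIGENCE * intelligence_score
            + WEIGHT_READING_SPEED * reading_speed_score)


# Hypothetical expectancy table: (age band, marital status) -> proportion
# of past cases in that cell showing the predicted outcome.
EXPECTANCY_TABLE: Dict[Tuple[str, str], float] = {
    ("under_21", "single"): 0.45,
    ("under_21", "married"): 0.30,
    ("21_or_over", "single"): 0.25,
    ("21_or_over", "married"): 0.15,
}


def predict_by_table(age_band: str, marital_status: str) -> float:
    """Mechanical prediction by looking the case up in a table."""
    return EXPECTANCY_TABLE[(age_band, marital_status)]


equation_prediction = predict_by_equation(110, 60)
table_prediction = predict_by_table("under_21", "single")
print(equation_prediction, table_prediction)
```

Neither function contains any step at which judging, inferring, or weighing is done: given the same inputs, any clerk (or machine) applying these rules arrives at the same prediction. How the inputs themselves were obtained is a separate question.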
It is obvious that before we ask what mode of combination of 
data is used in reaching a prediction, the data have already to 
be somehow given. Thus, a statistical clerk may combine, by 
purely mechanical, explicitly stated rules, sociometric judgments 
made by fraternity brothers. Given such judgments, the clerk 
is proceeding statistically in my sense—the clerk needs only to 
be able to read, write, and figger to get out a prediction. In the 
extreme case the clerk might not even know the source of the 
ratings, or the empirical meanings coordinated to the numbers 
he is given and the predicted score at which he arrives. But if we 
inquire into the fraternity brothers’ judgments themselves (which
might even be couched in predictive terms, e.g., “Who would be
the best arranger of a picnic?”), these judgments are not arrived
at mechanically or statistically, in my sense. They are human
judgments, the rules for which are buried in the judges’ heads;
we cannot train a clerk to observe the subjects’ behavior and
then, by straightforward mechanical means, duplicate (except for
ordinary clerical errors) the judges’ judging behavior. I am not
here concerned with which would be better in predicting the final 
criterion of picnic management; I am simply pointing out that 
there is an obvious, noncontroversial operational difference be- 
tween the clerk’s activity and the judge's, a point which is estab- 
lished immediately we realize that a second clerk (or a machine) 
can be easily made to duplicate the predictions of the first start- 
ing with the same data, a possibility no one seriously claims with 
respect to the judgments of the judge. Whether what the judge 
“adds” is error or not is here quite beside the point.

We see from this example, however, that the question “Is this 
prediction clinical or statistical?” is likely to be an ellipsis. The 
expanded form would be “Is this prediction, given such-and-such 
data expressed in so-and-so form, clinical or statistical?” Thus, 
we have immediately a question of levels, in the sense that the
transition from a certain class of statements, scores, or behavioral 
adjectives to the prediction proper may be purely mechanical, 
following explicit rules; whereas this evidential class itself may 
consist of members all, some, or none of which were arrived at by 
human judgment, at least partly inexplicit. There is no need for 
persistent ambiguity here, since in any real case we can specify 
the level of data with respect to which the query “Clinical or 
statistical?” is being raised. That the answer varies as we treat
different levels or stages of the same total predictive process is 
only to be expected. The use of a Stanford-Binet score in a re- 
gression equation is statistical, and starting with the score as the 
datum, this regression method can be significantly compared 
with a competitor prediction by a clinician looking at the same 
set of numbers. Yet at a much lower level, the scoring of the 
individual item responses, there occurs a process of human judg- 
ment which, no matter how reliable it can be made by short 
training, is still not quite clerical or mechanical in character. In 
several of the empirical studies we shall review in Chapter 8 (e.g., 
Wittman’s, 103) the reader should bear this matter of levels in
mind, since judgmental components enter into the total
predictive chain at a level below that for which the crucial
comparison of clinical and statistical is being made.

Let us pause for a moment to consider the fact that all four 
combinations of data with methods are constantly occurring in 
clinical practice. For this reason any discussion of the problem 
that does not distinguish between method and data is likely to 
lead to confusion:

1. Psychometric data combined mechanically. An intelligence 
test and a test of reading speed are combined in a multiple re- 
gression equation for the prediction of college grades. 

2. Psychometric data combined nonmechanically. A clinician
skilled in the interpretation of the Strong Vocational Interest
Blank, the Rorschach, or the Minnesota Multiphasic gives a
personality description and guesses a prognosis from inspecting a
profile of one of these devices.

3. Nonpsychometric data combined mechanically. Parole
prediction tables in criminology use data such as age at first
sentence, size of community, and marital status, but these data are
combined by a statistical table in a mechanical fashion to arrive
at a prediction.

4. Nonpsychometric data combined nonmechanically. On the 
basis of the history, an interview, and observation of the patient's 
behavior on the ward, a psychiatrist decides to give the patient 
electroshock.

More complex combinations also occur. A very common one is
the combination of high school rank with ACE score in a
regression equation to predict academic grades. This is of course an
instance of psychometric plus nonpsychometric data, combined
mechanically. The most common case of all in clinical practice
is that of psychometric plus nonpsychometric data combined
nonmechanically, where we have the history, an interview, ward
behavior, and the results of standardized and semistandardized
psychological examinations combined in a staff conference in the
attempt to yield a diagnosis (in the broad sense of that word)
which in turn entails some sort of prediction.


The Rationality of Inference from 
Class Membership 


One point which I feel is really crucial is Allport’s seeming
implication that inference from class membership is somehow
inherently fallacious. He does not explicitly state this, but some of
the arguments leave one wondering if he does not believe it. For 
instance, in his monograph on personal documents we find the 
following paragraph: 

Where this reasoning seriously trips is in prediction applied to 
the single case instead of to a population of cases. A fatal non- 
sequitur occurs in the reasoning that if 80% of the delinquents 
who come from broken homes are recidivists, then this delinquent
from a broken home has an 80% chance of becoming a recidivist. 
The truth of the matter is that this delinquent has either 100% 
certainty of becoming a repeater or 100% certainty of going 
straight. If all the causes in his case were known, we could pre- 
dict for him perfectly (barring environmental accidents). His 
chances are determined by the pattern of his life and not by the 
frequencies found in the population at large. Indeed, psychologi- 
cal causation is always personal and never actuarial. (5, p. 156.) 

In general, I agree with the content of this paragraph and ad- 
mit the importance of Allport’s point. However, the phrase “a 
fatal nonsequitur” could be a source of confusion, because one 
gets the impression that Allport believes it is a nonsequitur
because it is based upon an inference from the fact of class
membership. I should like to stress that if nothing is rationally
inferable from membership in a class, no empirical prediction is ever
possible. There is, in Allport’s paragraph, a subtle implication 
that by nonactuarial methods you can predict “for sure.” It is in- 
teresting to note that in spite of his dislike for actuarial concepts 
he begins the crucial sentence with “His chances are determined.” 
The whole notion of someone’s “chances” is, as Sarbin has
emphasized, an implicitly actuarial notion.

The superiority in some cases of making such predictions from 
a study of the occurrences in the individual life over trying to 
make them on the basis of his membership in a class of persons 
can be established without departing from actuarial reasoning if 
we construct a table such as that shown on page 21. Here, situa- 
tions are represented along the horizontal—e.g., days of the 
week —and persons along the vertical. The marginal totals in the 
table give us the over-all frequencies for situations such as the 
probability that a person will go to the movies on Saturday night
if we know nothing about the person; the corresponding marginal 
totals going the other direction give us the probability that
Professor A will go to the movies when we don’t know which night
it is. It is apparent that in general the maximization of “hits” will
be achieved when the probability figure used to arrive at our
prediction is that of the smallest possible subset, i.e., P_Ai rather
than P_A or P_i (and, a fortiori, P). Special cases exist in which it
makes no difference.

                          SITUATIONS
PERSONS     1      2      3      4      5      6      7     Total

   A      P_A1   P_A2   P_A3   P_A4   P_A5   P_A6   P_A7     P_A
   B      P_B1   P_B2     .      .      .      .    P_B7     P_B
   .        .      .      .      .      .      .      .       .
   .        .      .      .      .      .      .      .       .
   Z      P_Z1   P_Z2     .      .      .      .    P_Z7     P_Z

 Total    P_1    P_2    P_3    P_4    P_5    P_6    P_7       P

It should not be implied, as Allport seems to, that we can
always do better knowing the frequency for Jones than we can
knowing the frequency for the class to which Jones has been
ordered. In the event that the modal frequency is attached to the
identical prediction whether the analysis is by situations or by
persons, we would be predicting the same thing by both methods
and the success frequency for the table as a whole would be the
same. However, if there is at least one row or column in which
we would reverse the prediction on the basis of the subclass
frequency, we will stand to improve our guesses. It is obvious that
the best prediction would be that based upon the P value for a
given entry, i.e., for what I shall call an occasion, meaning a
person-situation interaction. To carry the argument further, it
might be that we could improve even over this kind of guess if
the situations for Professor A were themselves ordered as to time,
so that whereas the over-all frequency is .75, an analysis for such
a time series would lead us to conclude that the relative
frequencies were not random with respect to successive occasions.
These are fairly obvious points but I stress them in order to make
clear that Allport can defend his interest in Jones as an
individual without departing at any point from an analysis which is
still essentially statistical; for his conclusions can be based simply
upon an analysis of certain class inclusion relationships among
frequencies. I do not, however, wish to defend the Lundberg-Sarbin
position that all prediction is of this sort, as will be clear
from the discussion of the probability of hypotheses in Chapter 6.
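
The claim that hits are maximized by using the frequency of the smallest available subset can be checked on a small person-by-situation table. The counts below are invented for illustration (three hypothetical persons, three nights, ten observed occasions per cell); the prediction rule assumed throughout is simply to predict the modal outcome of whatever group the frequency is taken from.

```python
from typing import Dict, List

# Hypothetical counts: occasions attended out of 10 observed per cell.
OCCASIONS_PER_CELL = 10
NIGHTS: List[str] = ["Fri", "Sat", "Sun"]
ATTENDED: Dict[str, List[int]] = {
    "A": [9, 8, 6],
    "B": [2, 7, 1],
    "C": [3, 9, 2],
}


def modal_hits(attended_counts: List[int]) -> int:
    """Hits obtained when the modal outcome of the pooled group is
    predicted for every occasion in that group."""
    went = sum(attended_counts)
    total = OCCASIONS_PER_CELL * len(attended_counts)
    return max(went, total - went)


all_cells = [count for row in ATTENDED.values() for count in row]
columns = [[row[j] for row in ATTENDED.values()] for j in range(len(NIGHTS))]

grand_hits = modal_hits(all_cells)                               # uses P only
person_hits = sum(modal_hits(row) for row in ATTENDED.values())  # row marginals
situation_hits = sum(modal_hits(col) for col in columns)         # column marginals
occasion_hits = sum(modal_hits([count]) for count in all_cells)  # cell entries

print(grand_hits, person_hits, situation_hits, occasion_hits)
```

With these made-up counts the four strategies score 47, 59, 61, and 71 hits out of 90 occasions. The gain from the cell entries arises only because some rows and columns contain cells whose modal outcome reverses the marginal prediction; where no reversal occurs the hit counts coincide.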

As to Allport’s emphasis upon the distinction between predic- 
tion from categories and predictions for the individual, it should 
be clear that in principle all laws even of the so-called causal- 
dynamic type refer to classes of events. “Adding more informa- 
tion about the person” is taken by Allport and Alexander (2) 
as a relatively unanalyzed idea. But a case can be made that this 
always consists in assigning him to a still narrower subclass, that 
is, to a class having more restricting properties. The question of 
the optimal subclass has been considered by Reichenbach (82, 
p. 316) and from his point of view there is no such thing as the
probability of an event. There are as many probabilities as there 
are specifiable classes. No one of them is any truer than the other, 
but nevertheless from the standpoint of prediction, there is a best 
class, and this best class is always to be defined in the same way.
It is the smallest class, i.e., extensionally smallest and
intensionally most complex, for which the N is large enough to generate
stable relative frequencies. 
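
The selection rule just stated can be written out as a small sketch. The nested classes, their sizes, and their frequencies are hypothetical (only the 80% figure echoes Allport's example), and the threshold of 30 cases is an arbitrary stand-in for "large enough to generate stable relative frequencies," which the text does not quantify.

```python
from typing import List, NamedTuple, Optional


class ReferenceClass(NamedTuple):
    description: str
    n: int                      # number of cases observed in the class
    relative_frequency: float   # observed frequency of the predicted outcome


def best_reference_class(classes: List[ReferenceClass],
                         min_n: int = 30) -> Optional[ReferenceClass]:
    """Return the smallest class whose N still reaches the stability threshold."""
    stable = [c for c in classes if c.n >= min_n]
    if not stable:
        return None
    return min(stable, key=lambda c: c.n)


# Hypothetical nested classes, each adding a restricting property.
NESTED_CLASSES = [
    ReferenceClass("delinquents", 5000, 0.60),
    ReferenceClass("delinquents from broken homes", 1200, 0.80),
    ReferenceClass("... and first sentenced before age 14", 90, 0.70),
    ReferenceClass("... and from the same street as Jones", 4, 1.00),
]

chosen = best_reference_class(NESTED_CLASSES)
print(chosen)
```

Here the class of 90 is chosen: the class of 4 is narrower, but its frequency rests on too few cases to be trusted under the assumed threshold. No one of the four frequencies is "truer" than the others; the rule only says which one to predict from.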

Paradoxically, the uniqueness of individual events which
Allport is at such pains to emphasize in all his writing forces us
to assume that it is rational to entertain expectancies about the
future on the basis of class membership. The alternative view, if
made explicit, would have to be something like this: “Nothing
can be rationally said about an individual instance on the basis of
its class membership, because the members of the class differ with
respect to other predicates than the defining one, or differ on
some quantitative dimension as regards the defining predicate
itself, or there is a qualitative difference in arriving at the same
dimensional point.” Even the ordinary practical decisions of
everyday life become strictly impossible to rationalize if one
really argues consistently that it is not rational to decide in any
particular instance on the basis of a known or estimated
frequency in some class to which the unique instance belongs.

This can be made very clear by considering the case of a
regression system leading to a multiple R of .999. Surely Allport
would not deny the rationality of predicting the individual
subject’s behavior on this basis. But if this is reasonable, is not .990
reasonable? And then, why not .90, and thus .75, and, to be
consistent, .95? Surely there is no miracle that renders such
prediction suddenly irrational, no discontinuity in the situation such
that, say, to predict for an individual when R = .9 is legitimate,
but not when R < .9. The only conceivable discontinuity in the
logic as related to the statistics would be at R = 1.00; but if
Allport were to maintain that it is irrational to predict for
individuals when the prediction system involves an R in the open
interval (-1, 1) he would have to abandon all prediction, and not
only in the social sciences at that!
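
The absence of any discontinuity can be seen by tabulating how the error of prediction changes with R. The sketch below uses the standard large-sample result that the standard error of estimate is the criterion's standard deviation multiplied by the square root of (1 - R squared); small-sample corrections are ignored, and the choice of R values simply follows those named in the paragraph above.

```python
import math
from typing import List, Tuple


def relative_error_of_estimate(multiple_r: float) -> float:
    """Standard error of estimate as a fraction of the criterion's
    standard deviation (large-sample form, no degrees-of-freedom correction)."""
    return math.sqrt(1.0 - multiple_r ** 2)


R_VALUES = [0.999, 0.990, 0.95, 0.90, 0.75]

error_table: List[Tuple[float, float]] = [
    (r, relative_error_of_estimate(r)) for r in R_VALUES
]

for r, err in error_table:
    print(f"R = {r:.3f}  error / SD = {err:.3f}")
```

The error fraction grows smoothly as R falls, and it reaches zero only at R = 1.00. Nothing in the arithmetic marks off a value of R below which prediction for the individual case changes its logical character.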


The Special Powers of the Clinician 


Stouffer (95) has treated the question “What can the
clinician do with his facts beyond that which can be done by the
mechanical application of an actuarial table or a regression
equation?” In his discussion Stouffer chiefly emphasizes the fact that
the clinician can in special cases give more weight to a factor
than it is given in the actuarial table. On what basis can he
validly do this? As has been pointed out (e.g., by Lundberg (70,
p. 382)), if he does so, he must be using some law or other based
upon his previous experience, and this law, argues Lundberg, is
actuarial. The sense in which Lundberg’s use of the term “actuarial”
in this context is legitimate we shall consider later. At least it is
admitted by all that there are special instances in which the
clinician can apply some knowledge which is not included in the
table or which, if it is, is not given the weight that he feels it
should be given in the case at hand.

Whether the clinician tends to improve over the table under 
these conditions is an empirical problem. For instance, suppose 
that in the table given in Chapter 3 we are trying to predict 
whether a given professor will attend the movies on a given night.
On the basis of the values in this table and a failure to show any
time-series change in the relative frequency when the occasions
are ordered as to time, we arrive at a probability of .90 that he
will attend the neighborhood theater, the present night being
Friday. The clinician, however, knows in addition to these facts
that Professor A has recently broken his leg. This single fact is
sufficient to change the probability of .90 to a probability of
approximately zero. Sarbin or Lundberg might reply to this that
either such a fact is important in the prediction or it isn’t. If it is
important, that is, if it should be taken into account (whether 
the clinician thinks it should or not), it can in principle be dis- 
covered by the use of actuarial tables. If the word “actuarial” is 
used this broadly, so as to be synonymous with “inductive,” I 
doubt that any clinician would care to argue the issue. Whether 
this is a useful way to use the word I shall consider below. I 
should like merely to point out that a statistical study of a large 
number of professor-situation occasions of the present type, in
which factors were decided upon on the basis of the establishment
of statistically significant differences between the movie-goers
and the non-movie-goers, would presumably not result in the
isolation of broken legs as an important variable. The simple
reason is, of course, that this is a factor of extreme rarity in both
of the criterion groups. In other words, such a factor does not 
appear as statistically important in the mass event, but if the 
clinician knows this fact in the case of Professor A he (correctly) 
allows it to override all other data in the table. 

The actuary may counter by saying, “If the factor in question 
is so rare, why bother with it?” It is the tremendous interest in 
the individual case that defines the clinician. Furthermore, speak- 
ing of the mass of cases, there may be many (different) rare kinds 
of factors. The cases which they largely determine add up to a
very sizable minority of all the cases for which prediction is made.
The situation is somewhat like the old paradox that “an
improbable event is one that hardly ever happens, but nevertheless
something improbable happens almost every day.” An improbable
factor of a given type may occur with extreme rarity, but
improbable factors as a class, each of which considered singly will
not appear in a statistical analysis as significant, may contribute
heavily to the “misses.”
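
The arithmetic behind the paradox is easy to sketch. The figures below are hypothetical: 200 distinct kinds of rare factor, each present in one case per thousand, and each assumed to occur independently of the others (an assumption made only to keep the calculation simple).

```python
from typing import List


def prob_at_least_one(prevalences: List[float]) -> float:
    """Probability that at least one factor is present in a given case,
    assuming the factors occur independently of one another."""
    prob_none = 1.0
    for p in prevalences:
        prob_none *= (1.0 - p)
    return 1.0 - prob_none


# Hypothetical: 200 kinds of rare factor, each with prevalence .001.
RARE_FACTOR_PREVALENCES = [0.001] * 200

share_of_cases_affected = prob_at_least_one(RARE_FACTOR_PREVALENCES)
print(round(share_of_cases_affected, 3))
```

Under these assumptions about 18 per cent of cases carry at least one such factor, although no single factor would be frequent enough in either criterion group to show up as statistically significant.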

In passing, it may be pointed out that these rare cases furnish
one of the respects in which the human brain can be a very
sensitive indicator. To take a simple example, I recently attended a
staff conference in which one of the psychologists made a correct 
diagnosis of an absent patient after hearing an inadequate, if not 
downright misleading history, because he recognized the profile 
pattern on the MMPI as very similar to an unusual pattern he
had seen over four years previously in a case of alcoholic
hallucinosis. It is of course true that with a sample of only one case,
there is a very sizable chance that he is wrong. (It just happened 
in the present instance that he was right, so he now has N = 2!) 
This raises an entirely different although legitimate question, 
namely, that of validating the clinician. I agree with Sarbin and 
Lundberg that this is a thoroughly actuarial problem, involving 
the discriminative use of statistics. We need here Reichenbach’s
distinction between the “context of discovery” and the “context 
of justification” (82, p. 7). The clinician may be led, as in the 
present instance, to a guess which turns out to be correct because 
his brain is capable of that special “noticing the unusual” and 
“isolating the pattern” which is at present not characteristic of 
the traditional statistical techniques. Once he has been so led 
to a formulable sort of guess, we can check up on him actuarially. 
The whole problem of the miraculous brain is intimately
involved in the problem of clinical and statistical methods of
prediction. Clinicians often hold the view that no equation or table
could possibly duplicate the rich experience of the sensitive
worker. Here psychology has its precedent from medicine, in the old
country doctor who is a “brilliant clinician,” as evidenced by the
fact that he seems to be able to “smell” diphtheria merely by
walking into the sick room. Murray has stated that no
instrument could have the analyzing and integrating power of the
human brain. Having extracted from a man the best possible
explicit verbal description of his wife’s face, we can without much
difficulty find hundreds of women in a given community who
meet the requirements of his description; and yet the man himself
is able in a split-second glance to distinguish his wife’s
face from the others. Opposed to this kind of datum, one may
ask whether modern fire-control methods could have been
constructed by the use of clinical intuition, or without the aid of
explicit mathematical analysis? It is not difficult for the pro- 
tagonists of the clinical or actuarial view to cite both kinds of 
evidence, to show either that the brain is a good instrument or 
that it is a relatively poor one. 

I do not feel that there is enough empirical evidence at hand to 
decide this question or, better, to determine in which situations 
the brain is a powerful device and in which it is relatively weak. 
An obvious hypothesis, suggested by such researches as those of 
S. G. Estes (40) and the mass of material gathered by the
Gestalt psychologists, is that the brain’s “superiority” shows up 
heavily at the level of perception itself. At the level of subtle 
cues of a primarily social type, any normal person has had a very 
long history of rewards and punishments with respect to re- 
sponses to such cues. Responses to certain configurations of sense 
data as being indicators of the inner states of other organisms are 
presumably acquired very early. It is even possible that some of 
these configurations do not require to be learned but are given 
as part of our biological heredity. (Cf. Goodenough, 46.) If the 
term “facts” is used with sufficient broadness to include percep- 
tual facts of this type, we return to the argument about actuarial 
methods of combination when the facts are given, as, for example, 
immediate or impressionistic clinical judgments. In other words, 
if we are willing to call such immediate impressionistic responses 
to social cues “facts” in the sense that the clinician is here oper- 
ating as a testing instrument of a sort, it is still an open question 
whether the fact that the patient acts hostile or dominant ought 
to be given the weight that the clinician gives it in arriving at his 
predictions. An empirical example of this sort of thing will be 
found in the study of Wittman discussed below. 

In any case, psychologists should be sophisticated about the 
errors of observing, recording, retaining, and recalling to which 
the human brain is subject. We, of all people, ought to be highly 
suspicious of ourselves. For us, the problem of the adequacy of
the analyzing and integrating human brain is to be approached
through an empirical investigation of its success.

I once worked with a psychologist who has been very much 
interested in the clinical use of a certain test. An extremely sensi- 
tive and able clinician, he had administered this test to somewhat 
over 600 patients of varied diagnoses and intelligence levels. To 
many of these patients he had also administered a Wechsler- 
Bellevue Intelligence Test. He stated to me on one occasion that 
he felt he could do a pretty good job of estimating IQ’s from the 
new test, although he had never checked himself against the 
Wechsler systematically. He had “noticed” that on the whole the 
correlation seemed “pretty good.” The correlation between the IQ 
of a group of cases and this clinician’s guesses from his favorite 
test was .04. The point of this anecdote is that this clinician 
was quite sure that he could do it, and the fact of the matter
was that he could not. It should not be necessary to admonish
psychologists on this subject, but recent conversations have
convinced me that there are some clinical psychologists who are so
busy being clinicians that they tend to forget they are
psychologists. The kind of skepticism about human observation and
inference which was engendered in large part by the classical studies
in the psychology of testimony and by the early work on judging
people, e.g., that of Hollingworth (55), can be carried to
extremes. But we have no right to assume that entering the clinic
has resulted in some miraculous mutations and made us
singularly free from the ordinary human errors which characterized
our psychological ancestors. There are some published studies of
the hypothesizing and predicting behavior of clinicians which
ought to make us rather cautious and humble in our
claims (6, 39, 64, 68, 69, 97, 98).
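
The systematic check that the clinician in this anecdote had never made amounts to computing a product-moment correlation between his guesses and the measured IQs. A minimal sketch follows; the eight pairs of scores are invented for illustration and are not the data behind the .04 reported above.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length sequences.
    Raises ZeroDivisionError if either sequence has no variance."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 cases")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical data: measured IQs and the clinician's guesses for the same cases.
measured_iq = [85, 92, 100, 104, 110, 118, 125, 131]
guessed_iq = [105, 98, 120, 90, 112, 95, 108, 101]

validity = pearson_r(measured_iq, guessed_iq)
print(round(validity, 2))
```

The point of running such a check is that the clinician's impression that the correlation "seemed pretty good" is itself a prediction, and one that a tally of this kind can confirm or refute.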


The Theoretical Argument of T. R. Sarbin 


The most radical of the recent actuarial debaters is T. R.
Sarbin, whose systematic treatment and review of the few
experimental studies appeared several years ago (87). I find it
surprising that only one clinician was impelled to respond (32).
Since Sarbin has been a practicing clinician, the case he makes is 
all the more interesting and we owe it to ourselves to take his 
arguments with great seriousness. I have learned a great deal 
from study of Sarbin’s paper and for some time was persuaded 
that his position was wholly correct. But I feel now that the
argument as he states it requires qualification if not some basic 
revisions. 

The course of Sarbin’s argument runs something like this: No
predictions made about a single case in clinical work are ever 
certain, but are always probable. The notion of probability is 
inherently a frequency notion, hence statements about the proba- 
bility of a given event are statements about frequencies, although 
they may not seem to be so. Frequencies refer to the occurrence 
of events in a class; therefore all predictions, even those that from 
their appearance seem to be predictions about individual con- 
crete events or persons, have actually an implicit reference to 
a class. 

The basic premise by which Sarbin attempts to show that the 
clinician is always predicting actuarially and from classes whether 
he knows it or not is an appeal to the criterion of verifiability (or,
as the newer terminology of positivism has it, the criterion of
confirmability). All empirical statements must be capable in 
principle of confirmation or disconfirmation. If I say before 
throwing a die, “The probability of this die coming up an ace is
1/6,” how is such a statement to be confirmed? I throw the die
and it comes up an ace. Obviously the fact of an ace is compatible
with other statements of the probability, in fact somewhat more
so, at least in the sense that the thing which occurred was
“improbable” by having a probability of less than 1/2. On the other
hand, if it comes up other than an ace, this statement is still not 
confirmed because there is no principle of probability theory 
which states that the improbable may not occur. It is evident 
that a decision between a statement that the probability of an 
ace is 1/6 and the statement that it is 1/7 cannot be made on the 
basis of the outcome of my throw. Sarbin argues that unless such 
probability predictions are to be completely meaningless they 
must be confirmable in principle and hence they must refer im- 
plicitly to a class. For it is only if we have a reference class to 
which the event in question can be ordered that the possibility 
of determining or estimating a relative frequency exists.
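
The point about the die can be made concrete with a small sketch. It is only an illustration under stated assumptions: the two rival hypotheses (1/6 and 1/7) come from the text, but the function names and the run of 6000 throws with 1000 aces are invented here. The sketch compares how strongly a single throw, and then a long run of throws, favors one hypothesis over the other.

```python
from fractions import Fraction

def likelihood(p_ace, aces, throws):
    """Probability of one particular sequence containing `aces` aces in `throws` throws."""
    return p_ace ** aces * (1 - p_ace) ** (throws - aces)

def likelihood_ratio(aces, throws, p_a=Fraction(1, 6), p_b=Fraction(1, 7)):
    """How many times more probable the observed throws are under p_a than under p_b."""
    return likelihood(p_a, aces, throws) / likelihood(p_b, aces, throws)

# A single throw barely discriminates between the two hypotheses.
single_ace = likelihood_ratio(aces=1, throws=1)    # ace on the one throw
single_miss = likelihood_ratio(aces=0, throws=1)   # no ace on the one throw

# A long run (hypothetical counts) ordered to the class "throws of this die"
# discriminates between them sharply.
long_run = likelihood_ratio(aces=1000, throws=6000)

print(single_ace, single_miss, float(long_run))
```

Whichever way the single throw falls, the ratio stays close to 1, so neither statement is confirmed against the other; only the reference class of many throws yields a decision.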

Sarbin applies the same reasoning to the case of prediction of
single events in the clinical situation. The clinician is interested
in predicting whether Jones will commit suicide within a year.
The clinician, unless he is actually utilizing actuarial tables, does
not assign numerical values to these predictions; but as Sarbin
and Lundberg point out, the appearance of words like “probable”
or “likely” involves reference to an actuarial notion. The failure
to realize this sometimes results in amusing paradoxes, as in the 
reference to “his chances” in the paragraph from Allport cited 
above. A similar slip occurs in Alexander (2, p. 441). So that
whether it is stated numerically or in a phrase such as “is to be
expected,” Sarbin says that a statement such as “Jones will kill
himself” is a probability statement of the same type as the
prediction about the die. If this prediction actually refers to a
single event, i.e., the suicide of Jones, Sarbin argues that it is
unverifiable in principle and consequently excluded by the
verifiability criterion of
meaning. The only way in which it can have a meaning attached 
to it is by ordering it, as in the case of the die, to a class for which 
a success frequency of such predictions is defined. Therefore the 
clinician, if he is doing anything that is empirically meaningful, 
is doing a second-rate job of actuarial prediction. There is funda- 
mentally no logical difference (and here Sarbin is arguing the 
same position as Lundberg) between the clinical or case-study 
method and the actuarial method. The only difference is on two 
quantitative continua, namely that the actuarial method is more 
explicit and more precise. 

The argument seems to proceed quite inexorably to its end, 
and yet it is very difficult for the clinician to feel as though it is 
an adequate description of what he is doing, even implicitly. We 
do not feel that we are carrying on this kind of enterprise when 
we discuss the hypothetical dynamics of a case in a staff
conference. Lundberg asserts that this merely means that the
clinician does not “know” all the information which is contained
in his own previous experiences, and that whether the clinician
recognizes the actuarial character of his predictions is irrelevant.
It is
clear that the clinician’s feeling about the matter cannot be used 
as a rational argument, but it perhaps justifies us in scrutinizing 
Sarbin’s development very critically. 

The first thing that one might say about Sarbin’s exposition is
that he does not distinguish carefully between how you get there
and how you check the trustworthiness of your judgment. It is
clear (and in fact trivial) that the case study leads only to
probable judgment, and of course all knowledge about the
empirical world is confined to probable judgment. In order to assess
the confidence that we ought reasonably to place in the predic- 
tions of the clinician, it seems straightforward to keep a record 
of his guesses and to determine his success frequency. This pro- 
cedure, as has been pointed out above, is quite independent of 
any analysis of the processes whereby the clinician arrives at his 
judgment. Such an investigation can be carried on with a
predicting organism whose mode of operation is completely
enigmatic. (Cf. Reichenbach’s clairvoyant, 82, p. 358.) We must
study the judgments of the clinician and arrive at some reason- 
able statement as to his successes by ordering his predictions to 
a class; but does it follow that anything of an actuarial character 
is being carried on by the clinician himself? That he is operating 
implicitly in an actuarial fashion may be true, but this involves 
a different question from that which is involved in the matter of 
confirming his guesses. 

This brings us to a second consideration, touched upon in- 
directly by Chein (32), which may deprive Sarbin’s argument of 
much of its force. It seems to me that Sarbin is not distinguishing 
between a sentence about Jones and a sentence about the sen- 
tence about Jones. The grammatical form of the prediction as 
Sarbin gives it— “Student X has one chance in 6 of meeting the 
standards of competition in the university” —undoubtedly con- 
tributes to this confusion. But it seems to me that we have here 
to deal with two sentences. 


The first sentence is the prediction proper, “The student X will
not succeed at the University” or, in our example, “Jones will kill
himself.” It is obvious that these sentences are, in their content,
references to single events, occurring in the life history of par- 
ticular persons; nevertheless they are not excluded by the appli- 
cation of the confirmability criterion of meaning. It is not difficult 
to verify the suicide or nonsuicide of Jones. The suicide of Jones 
is a specific event which will or will not occur and which does 
not present any new problems for confirmation or disconfirmation 
beyond those involved in any particularistic hypothesis. If a year 
hence Jones is dead by suicide, the prediction is confirmed; if not, 
the prediction is disconfirmed. I do not think it legitimate to
invoke the confirmability criterion of meaning in trying to prove
that the clinician is here proceeding actuarially. When the
clinician says that he intends to speak about Jones and not about a
group of individuals in making such a prediction to the patient’s
relatives, the court, or a social agency, it is understandable that
he should resist Sarbin’s insistence that he is not talking about
Jones. Here, I believe the clinician is right and Sarbin wrong.

It is only when we wish to assign a confidence, weight, or 
probability to such a statement that the meaning criterion comes
into operation. The sentence assigning this confidence is about the 
sentence which speaks of Jones. In this respect (if we accept the 
generalized frequency interpretation of the probability concept) 
Sarbin is presumably right; but I do not think that even here the 
clinician is “wrong” since I am not aware of any statement by a 
clinician which denies the frequency interpretation of this second 
kind of statement. Allport, for example, is commonly taken as 
standing in the clearest opposition to Sarbin’s view; yet he has 
several times reiterated the need for studying the accuracy of
such judgments, the individual differences in this accuracy, cor- 
relates and determiners of these variations, the relation between 
the confidence of a clinician and his tendency to be right, and so 
on. I think all clinicians would agree that if they assign a
numerical probability to a prediction about Jones, although the
prediction about Jones has a specific content tied to Jones and is
itself directly confirmable by the individual event of the future,
the justification for the probability number must lie in the
establishment of some sort of empirical frequency.

I should like to point out in elaboration of Sarbin’s discussion 
that there are alternative ways of looking at this probability 
statement which are, in terms of present formulations, equally
legitimate. Even if we agree that the assignment of a probability 
number (such as 3/4) to an individual prediction can only have 
meaning in terms of a relative frequency for a class, it does not 
follow that this class is a class of individuals. Nor need it be re- 
peated occurrences in the life history of one person, which is the 
only alternative mentioned by Lundberg (70). Just as there are 
an indefinitely large number of classes to which Jones can be 
ordered, each of which will have its own “correct” relative
frequency, so also the prediction about Jones may be ordered to
classes not even defined by the properties of Jones or Jones’ situa- 
tion, but rather by some “non-Jones” characteristics. 

The crudest example is to order the prediction (treated as a
sentence occurring in the clinician’s verbal behavior) to the
entire class of sentences the clinician emits qua clinician. This is
the largest class and although its relative frequency is very stable, 
it is too broad to be very informative. To be sure, the establish- 
ment of such a number for each clinician would be of theoretical 
and practical interest, and a crude guess as to this relative fre- 
quency is made by most of us about our colleagues in clinical 
work. We may define narrower classes, e.g., what is the relative 
success frequency of clinician A when he is concerned with the 
prediction of suicide? Or, what is the relative success frequency 
of clinician A when he is making predictions about patients of a 
given sort? Or, what is the relative success frequency of clinician 
A when he attaches to his individual prediction the statement “I 
am very certain about this one”? 
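
A minimal sketch may help fix the idea of ordering the same predictions to different classes. Everything in it is hypothetical: the eight records, the field names, and the "parole" category are invented for illustration and are not data from any study. It simply tabulates one clinician's relative success frequency for the broadest class, for classes defined by the kind of event predicted, for classes defined by his stated certainty, and for a joint class.

```python
from collections import defaultdict

# Hypothetical record of clinician A's predictions (invented for illustration).
records = [
    {"kind": "suicide", "very_certain": True,  "correct": True},
    {"kind": "suicide", "very_certain": False, "correct": False},
    {"kind": "suicide", "very_certain": True,  "correct": True},
    {"kind": "suicide", "very_certain": False, "correct": True},
    {"kind": "parole",  "very_certain": True,  "correct": True},
    {"kind": "parole",  "very_certain": False, "correct": False},
    {"kind": "parole",  "very_certain": False, "correct": False},
    {"kind": "parole",  "very_certain": True,  "correct": False},
]

def success_frequency(records, classify=lambda r: "all predictions"):
    """Relative success frequency within each class produced by `classify`."""
    hits, totals = defaultdict(int), defaultdict(int)
    for r in records:
        label = classify(r)
        totals[label] += 1
        hits[label] += int(r["correct"])
    return {label: hits[label] / totals[label] for label in totals}

overall = success_frequency(records)
by_kind = success_frequency(records, lambda r: r["kind"])
by_certainty = success_frequency(records, lambda r: r["very_certain"])
joint = success_frequency(records, lambda r: (r["kind"], r["very_certain"]))

print(overall, by_kind, by_certainty, joint, sep="\n")
```

The same prediction about Jones thus receives a different number according to the class to which it is ordered, and none of the tabulations says anything about how the clinician arrived at it.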

Sarbin leaves the reader with the impression that there is a 
true probability which the crude mental operations of the clini- 
cian poorly approximate; in point of fact there are hierarchies of 
probability, and only an empirical study of frequencies will tell 
us on which system of classifying the predictions we ought to lay 
our bets. The best bets will be based upon the relative frequency 
of success of predictions for joint (multiple predicate) classes, 
including the clinician, the situation, the nature of the predicted 
events, and all the information about the individual. None of 
these procedures for assigning confidence to the concrete predic- 
tion of the clinician restrict him in the psychological operations 
he goes through in coming to the prediction. It is for this reason 
that we can admit with Sarbin the necessity for attaching an 
empirical meaning to a numerical probability, without
immediately concluding with him that the clinician is a second-rate
substitute for a Hollerith machine. This latter statement, which
both Sarbin and Lundberg appear to believe, may or may not be
true; the important point is that whether true or not, it cannot be
established by Sarbin’s appeal to the positivist meaning-criterion.

An even more fundamental difficulty with Sarbin’s argument 
might lie in the application of the meaning criterion even to the 
numerical probability. Sarbin’s discussion takes it for granted
that there is only one legitimate usage of the probability notion;
that is, he holds to the “identity conception” associated with the
views of Reichenbach and other frequentists. According to this
view, all probability statements, whether they refer to the a
priori “likelihood” in idealized games of chance, the empirical
frequencies of insurance statistics, the inferred frequency distri- 
butions of values of unobserved variables (such as components 
of the momentum of a hydrogen molecule), or even the proba- 
bility of theories and hypotheses—all these sorts of probability 
are reducible in principle to relative frequencies; and the justifica- 
tion for a statement of probability always lies, in the last analysis, 
in the establishment of a relative frequency. 

As opposed to this identity conception, we have the distinction 
made by Carnap (22, 23, 24, 25, 26, 27) between probability₁ and
probability₂, in which an effort is made to maintain an empiricist
definition of factual meaning without reducing every statement 
of the probability of a hypothesis to a success frequency. I am 
not competent to discuss the technicalities of this argument, and 
will only briefly indicate what I understand to be Carnap’s posi- 
tion. Consider a hypothesis h which we hold with some confidence 
on the basis of evidence e. We say that h is probable upon e to a
degree D. If the statement about the probability of h upon e is
interpreted as itself an empirical statement, then it is difficult 
to give it meaning within the confirmability criterion without 
interpreting it directly as some sort of a relative frequency, e.g., 
by ordering h to a class of hypotheses of a certain sort whose 
relative success frequency in the past is fairly well known. But it 
is hard for people, including many scientists and logicians, to 
think of the probability of a specific hypothesis as a frequency 
statement, even an implicit one.

Carnap takes the bull by the horns and attempts to solve the 
problem by denying that the probability statement relating h to
e is an empirical statement at all. He argues that the relationship
of h to e is a special kind of linguistic relation, different from, but
analogous to, the relationship that exists between the conclusion
of a syllogism in deductive logic and its premises. That is, to say
that h is probable to a degree p upon the evidence e is to say
that certain kinds of formal relationships, discernible by a study
of the sentences in the light of a knowledge of the semantical 
system of a language, obtain. This kind of probability, which 
Carnap calls “degree of confirmation,” seems to some to be closer 
to what we think of as “support of a hypothesis” than the
relative frequencies of Reichenbach. It is true that the rules for
establishing the degree of confirmation of a hypothesis upon its 
evidence have not been worked out in any detail and in fact 
are only described by Carnap in general terms for an extremely 
simple case. For the actual world in which we live, in which the 
various possible “state descriptions” and their weights which en- 
ter into the determination of Carnap’s “degree of confirmation” 
are not even known to us, an actual computation of the proba- 
bilities cannot be carried through. 

In this sense Carnap’s treatment merely gives us a hint as to 
the direction in which a nonfrequency interpretation of proba- 
bilities might proceed. However, so far as I know, the frequentists 
are in pretty much the same position when it comes to the cal- 
culation of actual pragmatic probabilities in scientific hypothe- 
sizing and ordinary life. To decide upon this issue is beyond the 
scope of the present discussion, and in this respect the psycholo- 
gist interested in Sarbin’s point of view will simply have to wait 
upon the further developments in technical inductive logic. I do
not mean to invoke the name of Carnap in ad verecundiam against
Sarbin, but it is only fair to point out to clinical readers, who may
perhaps be unfamiliar with the logic of science literature, that we
can quote nonfrequentist scripture when Sarbin quotes
frequentist scripture at us. The logical status of probability
concepts is one of the most technical and obscure problems of modern
philosophy and logic of science, and it would be very dangerous for
us to draw any such far-reaching conclusions about clinical method
as Sarbin draws until the logicians have agreed upon the shape
of a solution at least. 




The Problem of the Logical Reconstruction 
of Clinical Activity 


The problem with which we are presented is, on the one hand,
that of giving a behavioral description of what the clinician does,
which is a task of the empirical sociology and psychology of
science; and, secondly, carrying out a rational reconstruction of
this activity, i.e., showing from the logical standpoint in what
way his predictions are related to their grounds. Most of the
resistance which I as a clinician feel against the Sarbin-Lundberg 
interpretation of clinical work springs from the belief that al- 
though at bottom, in a most general epistemological sense, their 
analysis is substantially correct, yet it is stated in such a manner 
as to give an oversimplified picture of my clinical activities. Lund- 
berg (70) has endeavored to reduce this sort of resistance on the 
part of clinicians by arguing that the whole clinical-actuarial
issue is based upon a misunderstanding, and that if the clinician
had a really adequate comprehension of the actuarial position he
would no longer find the interpretation objectionable. I believe
that Lundberg is in part correct in this view, but I shall attempt 
to show that some of his reduction of the clinical process to pro- 
cedures which are fundamentally actuarial involves oversimpli- 
fications which, if not technically incorrect, are at least so far 
removed quantitatively from the usual usage of the word “ac- 
tuarial” that the employment of this word is downright mis- 
leading. 




In what follows I am not concerned with the empirical ques- 
tion of the relative efficiency of clinical and actuarial predictions 
(when these terms are used in the usual sense). This is an ex- 
perimental problem, on which the evidence is as yet inadequate; 
and what evidence we have will be reviewed later in the present 
work. Let me state very explicitly that in what follows in the 
present section I shall be concerned with a purely a priori dis- 
cussion of the clinical method, and am not intending to show by 
any argument whether it is or is not advantageous to make use 
of procedures over and above an actuarial table or a regression 
equation. I shall attempt to show that in principle there could 
be situations in which the Sarbin-Lundberg analysis does not 
hold up as a description of what takes place, leaving open the
question as to whether what does take place “pays off” in terms
of an increase in objective success-frequency. I am concerned
here with that part of Sarbin’s argument which is devoted to 
showing that it is irrational to expect the clinician to improve 
upon strict actuarial methods, and the allied aspect of his and 
Lundberg’s position that the clinician is always doing what 
actually amounts to actuarial prediction anyway. 

It might seem at first blush that these two opinions are
contradictory, but this is not the case. What Sarbin and Lundberg
are maintaining is that fundamentally clinical prediction is al-
... word is understood in its broadest ...

into the optimal prediction of the criterion. If a given variable 
does not, in fact, make any difference, the clinician should not be 
utilizing it; and taking it into account will not in the long run 
have any effect except to reduce his accuracy. If the variable does 
have an effect, this effect is measured by the weight which the 
variable receives in the predictive equation. There is some weight,
or more generally, some manner of combining the variables in
the predictive function, which is optimal. No combination which 
the clinician can make can, by definition, do better than this 
optimal function. It is practically certain that the clinician’s brain 
will not be able to determine the weight as well as the Hollerith 
machine. Ergo, the clinician cannot possibly do better; and, in 
general, is practically certain to do worse. It is this version of 
the actuarial argument which I wish now to consider in greater 
detail.
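
The version of the actuarial argument just stated can be illustrated with a small sketch. It rests on assumptions not in the text: a single predictor, a straight-line predictive function, squared error as the measure of accuracy, and five invented data points. Under those assumptions the least-squares weights are optimal for the sample on which they were computed, so that any other weights, such as the hypothetical ones attributed below to a clinician, can at best equal them on that sample.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def squared_error(slope, intercept, xs, ys):
    """Sum of squared differences between the criterion and the prediction."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Invented scores on one predictor and one criterion.
predictor = [1, 2, 3, 4, 5]
criterion = [2, 1, 4, 3, 5]

slope, intercept = fit_line(predictor, criterion)
actuarial_error = squared_error(slope, intercept, predictor, criterion)

# Hypothetical weights a clinician might apply intuitively.
clinician_error = squared_error(1.0, 0.0, predictor, criterion)

print(slope, intercept, actuarial_error, clinician_error)
```

The sketch shows only what the argument claims by definition, namely optimality within one fixed form of predictive function on the cases at hand; it does not settle whether the clinician is in fact restricted to such a function.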

For purposes of discussion let us consider the two extreme 
cases of the clinical-actuarial continuum without prejudging
whether there are any qualitative differences. Let us suppose that
the factual (observational) material from which prediction is 
made consists of the protocols of a diagnostic interview, a history 
obtained from a social agency, and results from a couple of
psychological tests, say the MMPI and the Rorschach. We are
interested in predicting whether the patient will respond favorably,
i.e., remain out of trouble and subjectively relatively free of 
anxiety and conflict, if he goes unpunished for a delinquency he 
has committed and is persuaded to change his occupation from F 
to G and alter his place of residence. Let us take for granted that 
some reasonably objective criterion of “favorable response” has
been set up. I shall assume that the clinician is a skilled
psychologist with a wide experience of cases, and that his use of
statistics of an explicit mechanical sort does not extend in the
present instance beyond the use of the norm data to express scores
on the two psychological tests. Any statistical experience for use
with the history and interview material and the psychometrics is
buried in the reaction tendencies of this clinician's nervous
system. He may or may not give verbal reasons for predicting as he
does, but at least he does not appear to proceed in a
straightforward mechanical fashion. At the other extreme, let us conceive
of a large and complex actuarial table or, alternatively, a multiple- 
variable prediction equation in which the variables employed are 
the psychometrics and some quantification based on a classifica- 
tion of events in the history and interview material. A clerical 
worker is to take this material, enter the actuarial table or substi- 
tute in the prediction equation, and grind out mechanically, by 
straightforward arithmetical procedures, without the use of any 
judgment or interpretive inference (61), a number which
represents the optimal prediction of the criterion here involved. That
we are usually concerned to predict several aspects of adjustment,
and hence would rarely want a single number, is not relevant here. 
And for any tender-minded clinician who objects to the whole 
idea, let him substitute a collection of adjectives such as he 
naturally uses every day. I am interested in a careful scrutiny of 
these kinds of predictive process both from the standpoint of the 
behavior of the predicting organism (clinician or clerical worker), 
and from that of the objective (formal) relation of the prediction 
to its evidence. 

Certain general questions about “lawfulness” and “uniqueness” 
must be considered before we proceed. I shall assume that there 
are general laws such as the laws of drive reduction, learning, 
perceptual organization, and that these laws are known by the 
clinician. This is not to say that I am giving the case to Sarbin 
by denying or even qualifying Allport’s uniqueness thesis. As 
Allport has pointed out, his (1937) position does not deny
determinism or lawfulness, since the idiographic approach is entirely
consistent with the view that such general laws do not preclude
uniqueness but are simply the laws describing “how uniqueness
comes about” (4, p. 558).


This uniqueness is not confined to clinical material, or even to 
the human case, but holds in the study of all sorts of behavior. 
In the laboratory investigation of the behavior of the white rat, 
this Allportian uniqueness holds strictly, and for at least two 
reasons. First, the fundamental laws of the learning process, such
as the statement that habit strength is related to the number of
reinforcements by a simple positive growth function, obviously 
involve the possibility of different values of the parameters. (In 
what follows, the framework of S-R-reinforcement theory is used;
of course the present argument concerning uniqueness applies, 
mutatis mutandis, to any view.) In the second place, the history 
of no two rats is identical even in a well-controlled experimental
study involving the same number of reinforcements. For purposes 
of the development of nomothetic learning theory, it is convenient 
to neglect, as in all scientific abstractions, the many individual 
aspects of the organism’s response and to order all responses 
sharing certain rather rough defining properties to a response 
class. Until single reaction occasions are thus grouped and equat- 
ed, it is impossible even to begin counting responses and hence 
to obtain any measure of response strength, oscillation, etc. The
concrete explication, confirmation, or application of a law such
as that of habit growth already presupposes certain qualitative
decisions. These are necessary (in any individual case) before
we can even assign a value of a habit to such a continuum as habit
strength. We speak of the rat “pressing the lever,” and in general
we do not pay much attention to the minor variations in
topography and in the intensive and durational properties of the
various instances of what is loosely called a response.
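
The first of the two reasons given above, that one general law admits of different parameter values, can be sketched briefly. The sketch assumes that the "simple positive growth function" takes a negatively accelerated exponential form; the function name, the asymptote and rate parameters, and the values given to the two rats are all hypothetical.

```python
import math

def habit_strength(n_reinforcements, asymptote, rate):
    """Assumed growth law: strength rises toward `asymptote` as reinforcements accumulate."""
    return asymptote * (1.0 - math.exp(-rate * n_reinforcements))

trials = range(0, 31, 10)  # 0, 10, 20, 30 reinforcements

# The same law with two hypothetical parameter sets, one per animal.
rat_a = [habit_strength(n, asymptote=100.0, rate=0.10) for n in trials]
rat_b = [habit_strength(n, asymptote=80.0, rate=0.03) for n in trials]

for n, a, b in zip(trials, rat_a, rat_b):
    print(n, round(a, 1), round(b, 1))
```

Both animals obey the one general law, yet after the same number of reinforcements their habit strengths differ, which is all that uniqueness in this first sense requires.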

As Skinner has pointed out, there is a difference between opera- 
tionally specifying and identifying members of a response class, 
which can usually be done to any desired degree of accuracy, and
specifying a response class which fractionates the behavior in the
way it is fractionated by the organism as a result of its unique
reactional biography. The criterion for the behavioral reality of
a response class is dynamic lawfulness. In studying the extinction
curve of a rat in a Skinner box, we might choose to count only
those lever pressings which were made with a force of 4 to 4.5 
grams and with the right paw. While this specifies a class opera- 
tionally, a response class so defined would show a much lower 
degree of orderliness than one simply defined by the fact that the 
lever is pressed. Lewin’s well-known distinction between
phenotypic and genotypic classification is essentially an insistence that
behaviors ought to be classified together not on a basis of arbi- 
trary topographical or other superficial resemblances, but on the 
basis of their dynamic lawfulness. The definition of response is 
one of the least adequately treated problems in modern rigoriza- 
tions of behavior theory; for example, Hull's Principles of Be- 
havior nowhere gives a general definition of this pivotal notion. 
The fundamental correspondence between the human and ani- 
mal case should not mislead us into neglecting those differences 
which, even if merely quantitative, are of tremendous importance. 
The chief among these differences is in the kind of defining prop- 
erty which is necessary to specify lawful response classes at the 
level of human social behavior. In the animal case, we ordinarily 
have access to an organism throughout its experimental history, 
and we have set up the conditions of reinforcement in such a 
manner that the defining properties of the maximally lawful
response class are relatively simple physicalistic ones. The reason
that “pressing the lever” is an adequate description of the re- 
sponse in the Skinner box is simply that it is this physicalistically 
defined property of the response class which we, as the experi- 
menters, have made the condition of reinforcement. The mechani- 
cal inevitability of this as we use the recording apparatus makes 
it easy to overlook the behavioral principle reflected. It would 
presumably be possible, even in the rat, to define the properties 
of the reinforced response-class by an ingenious manipulation of
the reinforcement history so that a naive experimenter would be 
hard pressed to specify these defining properties by a study of 
the behavior. Any defining properties would be characterizable 
by some disjunction or conjunction of properties of topographic, 
intensive, and temporal dimensions; but it is clear that they could 
be made more complicated than is the case when our
experimentation is directed at the nomothetic aim of discovering the
general laws of learning, and the terms of the disjunction might 
be very heterogeneous. Even at the level of the rat, there are 
complications which have, as yet, hardly been touched. The 
Skinnerian emphasis upon the generic nature of stimulus and
response is an important one, but it already obscures the fact
that there is not a complete equivalence of all members of a re- 
sponse (or stimulus) class, and that the inductive and extinctive 
effect of the emission of topographically different class members 
is not known. The principle of cumulative causation is very im- 
portant here because of the unknown but possibly marked influ- 
ence of generalization effects. For example, the previous history 
of the rat in the acquisition of chain-pulling behavior may alter
the characteristics of the modal response; so that whereas, from 
the standpoint of the experimenter, the reinforcement conditions 
are the same as for any other rat, the response is harder for the 
animal. This will result in an alteration of all the parameters of 
the learning process, and change the quantitative characteristics 
of the extinction curve. A rat clinician, ignorant of the previous 
history of chain-pulling experience, might infer, for example, a 
lower state of drive or a generally greater ease of extinction for 
the organism at hand, and thus fall into error. 

In the human case, this generic nature of stimulus and response 
presents tremendous difficulties to a physicalistic analysis. To
take an obvious example, how do we classify behavior as
aggressive? If Mr. B says things which might imply that he has a
compulsive, anxiety-driven need for economic status, and
subsequently Mr. A, who is usually bored by talk of money, tells Mr. B
many things about the tremendous wealth of Mr. C, we are likely 
to take this as indicating that Mr. A is aggressing against Mr. B. 
Furthermore, if we know Mr. B very well, we may realize that 
he is actually not motivated as Mr. A infers and that the “symp- 
toms” of a high economic status drive were actually a function 
of other aspects of Mr. B’s personality. In other words, Mr. A’s
response is classified as aggression even though it does not tend
to inflict tissue injury on Mr. B, does not cause Mr. B any kind
of anxiety, and would not be a remark classifiable as aggressive
when made to any arbitrary member of Mr. A’s culture. The
behavior which is important to clinicians always involves, at least
indirectly, interaction with other human organisms; and the
problem of specifying response classes and of taking certain
reactions as indicative of certain habit strengths or states of need
is, therefore, a fantastically complicated one. The relevance of
these considerations to the problem of prediction by the clerical
worker
will appear in the paragraphs below. 

I do not mean to cast any doubt here upon the epistemological 
thesis of physicalism. There is no question as to whether be- 
havior protocols furnish the confirmation base of all psychologi- 
cal assertions about others, nor whether any behavior interval 
can be “described in the physical language.” We are concerned 
here with the classifying of such dated behavior-intervals to yield 
measures of strength and, later, inferences as to the determinative 
inner conditions. “He took off his hat,” “He stood rigidly with 
hands at sides,” “He spoke quietly to the judge,” are all behavior 
descriptions in or close to the physical thing language. But one 
defining property of this set of responses, by which we recognize 
a state of respect, cannot be stated physicalistically. The culture 
reinforces in such a way that responses may covary in strength
and yet have no common topography. 

The laws which are of a truly general (nomothetic) nature may
exist at a much lower (molecular) level of analysis than we 
generally suppose. For example, the Hullian principle relating 
strength of response classes to number of reinforcements as AD 
independent variable may itself be a consequence of Guthrie-typ® 
laws (which is what a Guthrian would presumably argue) . If this 
should turn out to be the case, an ingenious manipulation of the 
animal’s experimental history might yield a single organism for 
which Hullian laws did not hold. This is not to deny determinism 
nor to doubt that general laws exist. Tt is simply to say that such 
laws as usually obtain are themselves derivative. If there is & 
sufficient stereotypy due to the experimental traditionalism © 
ordinary life and of the laboratory, such a fact will not be dis- 
covered. Certain initial conditions, and a sufficient isolation of 2 
particular Physical system, will lead to exceptionless regularities 
AIC SE however, consequences of more fundamental principles. 
A consideration of a system obeying the same fundamental laws 
but with different initial conditions will enable us to discover
the derivative nature of the principle that we have been taking 
as completely general. Most psychologists would probably feel 
this to be the case with many laws of (capitalistic) economics, 
for example. All “natural” cats slay rats; but Kuo showed this to 
be modifiable. Another example would be the close approach of 
a very large comet upon the motions of the planets as specified 
by Kepler's laws. The prediction and understanding of the ap-
parent irregularities which would immediately arise require a
passage to a more basic level of causal analysis as represented by
the formulations of Newton. For a discussion of the general prob- 
lem of novelty as related to the generality and level of laws the 
reader may refer to the excellent paper by Bergmann (10). 

Let us consider the predictive activity of the clinician in the 
light of these remarks. As clinicians we would usually say that 
to the extent that we do more than a second-rate job of actuarial 
prediction, we endeavor to form a conception of “this person”;
and it is from this conception, combined with certain admittedly
actuarial expectations as to the external events of the future, that
our prediction is derived. Most of us would argue, for example,
that the behavior we are trying to predict is a consequence of
inner variables and is not a causal consequence of the facts
utilized by the clerical worker. Everyone admits that behavior
is determined by the state of the field and organism at the time
it occurs. The facts of the psychometrics, the history, and the
interview are not related by direct causal laws to the events we
wish to predict. The immediate basis of the predicted behavior is
the state of the person in conjunction with the assumed future
state of the stimulating field. There is, because of our lack of
specific information (and our lack of knowledge of laws), merely
a crude and fragmentary relationship between the predictive data
and our hypothesis concerning the inner state or structure of the
person at hand. If this were not the case, of course the prediction
would not be actuarial, i.e., probabilistic, but would be strictly
deterministic. (I neglect here what I consider to be Sarbin’s
misapplication of the Heisenberg principle to the behavior case,
for a detailed refutation of which see London (66).)


What the clinician does is to utilize the given facts, together 
with crudely formulated laws, to invent a hypothesis concerning 
the state of certain intervening variables or hypothetical con- 
structs in his patient. On the basis of such diverse evidence as the 
Rorschach and Multiphasic profiles, a slip of the tongue during 
the interview, and a social worker's description of the patient's 
mother, the clinician arrives at such statements as “this patient 
has strong oral-dependent attitudes, against which he has set up 
dominant-aggressive reaction-formations.” Statements of this 
sort, which are often mixtures of propositions about habit strength, 
generalization gradients, topographic properties, drive levels, and 
even the parameters in learning functions and satiation functions 
themselves, are all covered by the phrase “forming the concept
of this person.” I do not believe that as clinicians we ought to be 
threatened or feel depreciated by such a general (and correspond- 
ingly empty!) formulation of our activities. To say that in form- 
ing a conception of a person I am assessing his needs and his modes 
of satisfying those needs (including the all-important need to 
reduce anxiety, and with it the immense collection of self-rein- 
forced habits which we call his defences) in no wise detracts from 
a recognition of the tremendous possibilities for variations and
complications that arise when a more specific description of these 
needs and habits is undertaken in seriousness. Let us now ask, 
what would Sarbin have to say about this process as contrasted
to the activity of the clerical worker? 

In the first place, he would point out that the “laws” which the 
clinician makes use of are actuarial. Certainly this is true, at least 
in the sense that all laws are based upon inductions, and all in-
ductions are actuarial in the general sense of Reichenbach. There
is no reason why the clinician should be hesitant to admit this,
so long as he detects no equivocation in the word “actuarial,” i.e.,
so long as the philosophical or epistemological use of the term
is not surreptitiously changed into the more [illegible] sense
in which we speak about statistical tables [illegible]. I
shall rephrase Sarbin’s viewpoint more neutrally
and simply say that the clinician ought to admit freely that the 
laws he employs are inductive. This is clearly trivial. 

It is perhaps worth mentioning that some of the laws which 
the clinician uses are not laws in which the present S and R facts
occur as independent and dependent variables, respectively. That
is to say, some of the laws are correlational R-R laws (cf. Spence,
98) and others are rather ill-established laws concerning hypo-
thetical inner events (cf. Feigl, 41, p. 42; Spence, 94, p. 73). That
these laws must have been suggested initially by observations of
behavior, and that they must ultimately be supported by be-
havioral data, is not tantamount to saying that they are laws
relating directly the data given the clerical worker to the behavior
which she is asked to predict. I am not interested here in the
question of how well such laws are supported at the present
time, but simply wish to indicate that the statement “If the
clinician uses laws he must be proceeding on the basis of some
previous inductive experience” does not necessarily imply that
such laws and his use of them are of the same sort as the multiple
regression equation.

Can this performance, “forming a conception of Patient A,” 
be duplicated by the clerical worker? I am sure that no one will 
seriously maintain it can in fact be duplicated by the clerical
worker; the question is whether it could be duplicated in principle.
Here we are on very dangerous ground and I do not have any
dogmatic pronouncements to make. I should like simply to raise
some questions which I think cast doubt on the view that, in
principle, the clerical worker could here duplicate the predictive
behavior of the clinician. In the first place, certain facts will be
seen by the clinician to support hypotheses as to the internal
economics and dynamics of the patient, although instances of
these facts simply do not occur in the actuarial table. It may be
asked, how can they be seen to support the hypothesis, unless
there is a second-rate actuarial table in the clinician’s head? And,
if this is the case, all we need to do is to get that table out of the
clinician’s head and on paper, and we will shortly discover that
the clerical worker can do a better job because the actuarial table
will assign a better weight. In spite of the plausibility of this 
argument and a personal disposition in its favor, I remain sus-
picious of it. What appears to me convincing when thus stated in
abstract terms, seems very unreal when I consider concrete cases. 
Let me give a clinical example of the sort that I cannot readily 
fit into this mold. 

A patient has been developing insight into her ambivalent at-
titude toward her husband. She begins to show some gross mani-
festations of hostility against him; for example, she tears up a
series of short stories he wrote some years ago, telling him he
knows perfectly well that they were no good anyway. Do we deal
here with a relatively unmixed expression of hostility previously
repressed by the patient, or are there other components in her
need structure contributing to this behavior? She reports that
one evening, feeling very nervous, she went out alone to a movie;
and as she was walking home, wondered if he would be “peace-
fully sleeping” upon her arrival. Entering the bedroom, she was
terrified to see, for a fraction of a second, a large black bird (“a
raven, I guess”) perched on her pillow next to her husband's
head. Asked to give her thoughts in connection with a raven,
she says that she shouldn’t have called it a raven, it was prob-
ably just a crow; in fact she doubts that she said raven in the
first place. Insistence that she did say raven elicits irritation. She
recalls “vaguely, some poem we read in high school, I guess. I
don’t know anything else about it.”

What prediction enters the listener’s mind with this reference?
The prediction is mediated by a miniature dynamic hypothesis.
[The reference is] almost certainly to Poe’s poem; one guesses that
the [illegible] important content determining her hallucination
is connected with the preceding thought about her husband
peacefully sleeping. The hypothesis forms itself: Nervous and
upset, she goes out alone to a movie while her husband, unmind-
ful of her, is able to “sleep peacefully.” The fantasy is that, like
Poe's Lenore, she will die or at least go away and leave him alone
[with the raven] croaking “Nevermore.” Then he'll be sorry, un-
able to sleep peacefully, etc. We formulate the further hypothesis,
which includes our hypothesis about the determination of the
particular hallucination, that she is concerned about her hus- 
band’s need for her, and would like to know how important she 
is to him. This leads to a prediction as to the leading themes we 
expect in the rest of the session. The prediction has a wide lati- 
tude, i.e., a class character is specified for the behavior, as always. 
But we anticipate that her (unguided) associations will touch 
upon the theme of punishing her husband, by going away some- 
how, that he would be sorry if she did, and the like. We also per- 
mit ourselves some leeway as to time, in that the development 
of the theme may not begin strongly until the next session, etc. 
But we do not make a vacuous prediction, since some manifesta- 
tions of the Lenore fantasy are to be expected, and fairly soon. 
Her subsequent remarks in the same interview return repeatedly 
to the general topic of her husband’s lack of concern for her 
condition, and his “sublime confidence” that she will “never do 
anything rash,” which turns out in further talk to cover both 
suicide and unexpectedly leaving him. Fortified by these con-
firmations, we begin to attach considerable weight to the hypoth-
esis that her hostile reactions are overdetermined, being in part
attempts at testing the limits of his love and acceptance. Sys- 
tematic attention to this hypothesis is well rewarded in the suc- 
ceeding sessions. 

The interesting question here is this: What are the general 
statistical uniformities which are allegedly able to generate the
initial hypothesis? I presume the situation of a woman hallucinat-
ing a raven next to her husband’s head is unique, and hence can-
not define a reference class for any relative frequency, either
known or unknown. To what larger class can the event be or-
dered? It would be a nonsensical classification, and would com-
pletely cut across the categories and dimensions which are really
involved here, to consider the obvious larger classes, e.g., having
hallucinations of birds. I do not suppose anyone would seriously
maintain that hallucinating birds is statistically associated with
the desire to test a husband’s love, or the unconscious fantasy of
leaving him. The general principles involved here are not difficult
to state; but what impresses me is their relatively vacuous char-
acter insofar as generating the particular hypothesis is concerned.

We are making use of such general statements as these: “When
a person describes an experience and subsequently [alters the]
description and refuses with some emotion to admit to his original
description, it is frequently the case that the original description
was correct and that it involves material which is [illegible]
important and which must be defended against.” “The only poem
involving a raven which is read with any frequency in high school
classes is Poe’s poem.” “One basis on which a literary production
may be associated with a situation or state of need in a person
familiar with it, is an unconscious identification with one of the
characters or an identification of one’s situation with that por-
trayed.” These are the principal statistical generalizations which
form the matrix for the construction of the present hypothesis. I
take it as obvious that the hypothesis could not be mechanically
ground out from these statements, even if the frequency words
were replaced by numerical frequencies, through an application
of probability calculus. The hypothesis “She saw a raven because
she was thinking of herself as dead or departed, which would
injure her husband and also make him realize how important she
was” is psychologically suggested by these facts; and I think it
is fair to say it would not be suggested to a clerical worker, even
if she were fully cognizant of the meaning of the above
propositions. It seems to me that even if we acquaint the clerical
worker with the statistical frequencies and make sure that she
understands the meaning of all the concepts involved, we still
have to create in her a readiness to invent particular hypotheses
that exemplify the general principle in a specific instance.
When we have done this last, which I do not think can be done
wholly by stating general rules, we have trained a clerical worker
to the point that she is now actually a skilled clinician.

Reik (83) gives numerous examples of clinical hypothesis
formation, which are instructive (and discouraging) to try to for-
mulate actuarially. A fascinating case of postdiction based on
only a fragment of behavior during analysis:


One session at this time took the following course. After a few 
sentences about the uneventful day, the patient fell into a long 
silence. She assured me that nothing was in her thoughts. Silence 
from me. After many minutes she complained about a toothache. 
She told me that she had been to the dentist yesterday. He had 
given her an injection and then had pulled a wisdom tooth. The 
spot was hurting again. New and longer silence. She pointed to 
my bookcase in the corner and said, “There's a book standing on 
its head.” Without the slightest hesitation and in a reproachful 
voice I said, “But why did you not tell me that you had had an 
abortion?” (83, p. 268.) 


Reik gives us his introspection on this bit of postdiction, to 
which I refer the interested reader. But let us ask, how can we
arrive at such a postdiction actuarially, making the generaliza-
tions and frequencies explicit so that the clerical worker can
duplicate Reik? The tooth extraction as a symbol of birth we can
put into a crude actuarial “law.” The silence, we can teach our
clerical worker, is usually resistance, conscious or unconscious.
Where does this leave us? “There is a probability P that the
patient is resisting something about birth.” So far, so good. But
this interpretation does not have the dramatic, time-saving
quality of Reik’s; and is much less specific. How work the “book
on its head” into the actuarial mold? This fragment gives Reik
his image of the fetus, and hence mediates the final touch of his
postdiction. Speaking in very general terms (and it is impossible
to speak otherwise, not merely because of the inadequate state
of theory, but because the kind of behavior with which we are
here dealing has an intrinsic vagueness, involving continuous
gradation in topography and marked variability from one indi-
vidual to the other in the defining properties of the response
class), we might say that “any words or images which indicate
properties belonging to a fetus may acquire induced strength
from mentation concerning fetuses.” It will be necessary to sensi-
tize the clerical worker to this very broad defining property, so
that when a specific member of the response class occurs, never
before observed and hence not present in an actuarial table even
of colossal N, he will respond to it as an instance of the class.
I do not mean to use the word sensitize in any mystical or un-
definable sense. I refer simply to the fact that the classification of
a particular patient’s response by the clerical worker is itself
a response, which must become elicitable by a great diversity of
patient responses seen as clinical stimuli. The defining proper-
ties of this latter class will, in general, not be simple. The majority
of the individual forms (physicalistically defined) will not be
listable in an actuarial table, partly because they will
never have occurred in any recorded clinical experience to date,
and partly because the number of them, thus botanized, would
become too cumbrous for any practical use. The complicated
kinds of mutual interaction between internal variables and ex-
ternal events which characterizes human clinical material results
in a situation in which a response having a specified topography,
emitted in a specified stimulus field, may indicate different states
of internal variables depending upon all well-confirmed hypoth-
eses about the individual. In ordinary life, we recognize this
when we say that the same behavior will suggest, in the extreme
case, even opposite interpretations when the behavior occurs in
two individuals concerning whose personality structure we have
already considerable knowledge.
What I am trying to indicate is that the general laws which re-
late the strength of responses to certain antecedent conditions,
even when they are adequately worked out, have to do with the
form of behavior covariation. But we can only talk about such
laws as applied to a particular case when we have already speci-
fied to some degree at least what end terms (stimulus class, re-
sponse class, [illegible]ing other responses by the same individual.
[Illegible] of relationship which clinicians have in mind
when they emphasize the necessity for knowing the “meaning” of
a given segment of behavior to the “whole person.” Although the
word “meaning” is sometimes used chiefly for its rhetorical effect, 
it seems to me that it indicates in this context a genuine prob-
lem in the classification of behavior.

We could presumably train the clerical worker, both by a feeble
attempt at stating general properties of a given response class,
and by the multiplication of many instances, to respond to a
segment of a patient’s behavior in this categorical fashion. Un-
fortunately, the verbal response “there’s a book on its head” is
only one of a thousand different sensitizations which must be
achieved if a clerical worker is to be able to order behavior to
such meaningful classes. Suppose we indoctrinate the clerical
worker with the whole system of dynamic theory by means of
which individual behavior segments are seen as supportive of
this or that particular hypothesis; and then we make this ab-
stract knowledge available for practical use by exposing the
clerical worker to innumerable instances of each sort. It seems to
me that this is the only way in which we can avoid the conse-
quences of the uniqueness for the mechanical application of an
actuarial table. With Sarbin and Lundberg, I argue that every
skilled clinician must be making use of some laws, however vague,
which may be of considerable generality, but which nevertheless
make it possible for him to order his material with respect to a
given patient in terms of some general nomothetic basic psycho-
dynamics. The problem is, however, to make these highly general
laws available to the clerical worker, and to build into her nervous
system the appropriate reaction tendencies so that she can use
them in the formulation of the individual case, many if not most
of whose evidential behaviors will have occurred too rarely to be
in any actuarial table. That is, a set of kinds of hypotheses such
as “this man has set up a reaction formation against impulse X,”
as well as a readiness to perceive a physicalistically diverse col-
lection of behavior segments as supportive of this or that particu-
lar hypothesis, must be taught to the clerical worker. In principle,
given sufficient intelligence and motivation, there is no reason
why this cannot be done; but as I have indicated above, such a
trained clerical worker has been made into a skilled clinician.


To summarize the argument just completed, one might say
somewhat as follows. The so-called general nomothetic laws of
behavior are laws relating responses to stimuli via certain inter-
vening variables whose states are in turn specified by [illegible],
e.g., hours of deprivation of food. Presumably the form of these
laws is genuinely nomothetic for a given species. The parameters
vary from organism to organism within the species but are in
principle inferable from values of other parameters and from cer-
tain combinations of dynamic changes in strength which are
themselves observable. The end terms involved in these laws,
however, are variable from organism to organism; the same is
true of the intervening variables. That is to say, the intervening
variables relate the facts only via tentative response classes. It
is necessary for the observer to have in mind habits, traits, de-
rived needs, and the like before he can see how a given behavior
datum supports propositions concerning the state of these varia-
bles.

If there were a very small number of habits, all manifesting 
themselves in the same way and having little or no variation in
their topography, it would be a simple problem of inverse proba-
bility to construct a particularistic hypothesis concerning the
system of inner states necessary to account for the observed be-
havior strengths of individuals. But in fact the “SH” involves
H whose dimensions (or, properties) vary greatly from individual
to individual and from time to time. Consequently the formula-
tion of the state of a particular organism involves the hypothesi-
zation of the forms of a set of Hs (and analogously, a set of D's
for drives). It is not feasible literally to list all the tremendous
collection of such habit and need forms in an actuarial table:
first, because really new forms constantly occur; and, secondly,
because particular combinations of such narrowly specified forms
will have no entries in such a table even from an extensive clinical
experience. On the other hand, it is very difficult to specify such
classes by their general defining properties, because in the case of
human social behavior the defining properties are, in general, not
physicalistic. One has to think of the hypothetical habits, traits,
or needs, including specifications of individual properties in some
detail, before he can understand that a given behavior datum
supports hypotheses concerning it. If one listed enough concrete
possibilities to give an idea of the response class, and indicated in
terms of general causal laws how they could be grouped together,
and had the clerical worker overlearn these so that they were at
sufficiently high strength in her verbal behavior to come out in a
particular clinical instance, the clerical worker would have been
transformed into a skilled clinician. 
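
The "simple problem of inverse probability" mentioned above can be made concrete with a small Python sketch. It assumes the easy case the text describes: a very small, fixed list of habits, each with known chances of producing each behavior. The state names, behaviors, and numbers (`PRIORS`, `LIKELIHOODS`) are invented for illustration and are not taken from any study.

```python
# Hypothetical inner states and their prior probabilities (invented numbers).
PRIORS = {"habit_A": 0.5, "habit_B": 0.5}

# Hypothetical P(behavior | state) for the few behaviors listed in the table.
LIKELIHOODS = {
    "habit_A": {"withdraws": 0.8, "argues": 0.2},
    "habit_B": {"withdraws": 0.2, "argues": 0.8},
}


def posterior(priors, likelihoods, observed):
    """Bayes' theorem over a small, fixed list of hypothesized inner states."""
    joint = {s: priors[s] * likelihoods[s].get(observed, 0.0) for s in priors}
    total = sum(joint.values())
    if total == 0:
        # A behavior form that was never tabulated gives the formula nothing to use.
        raise ValueError("observed behavior has no entry under any listed state")
    return {s: p / total for s, p in joint.items()}


if __name__ == "__main__":
    print(posterior(PRIORS, LIKELIHOODS, "withdraws"))
```

With a listed behavior the calculation is mechanical. With a behavior form that appears nowhere in the table (the raven, say) the function can only raise an error, which is the practical difficulty the paragraph points to.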

No matter how convincing the previous considerations seem, 
one still has the uneasy feeling that something must be wrong.
I find in myself the tendency to say something to this effect:
After all, the behavior is lawful. If it is lawful, everything about
it, including the topography and the dimensions of particular re-
sponses, must be a function of some variables which are deter-
minable. Therefore, it is not possible that the clinician could do
anything that the clerical worker could not do in principle. The
apparent inconsistency of this train of thought with what pre-
cedes can be, I think, readily resolved. It is a tautology for a
determinist to say that if we knew all the parameters in all the
equations of the behavior acquisition functions for an organism
at birth, and if we knew all the situations to which he was ex-
posed, the specification of the response classes would follow
directly from this knowledge and everything would proceed in a
mechanical fashion. But these initial parameters, and these
previous experiences, are not known to us. I think it is no exag-
geration to say that they will never be known to the practicing
clinician. The experiences which determine the topography of a
given response class and the mutual interrelations between needs
and habit strength are for the most part permanently inaccessible
to us when we come to consider the adult organism. The most
fantastically detailed social history and the deepest psychoanaly-
sis could only, from the purely physical standpoint of the nature
of verbal descriptions, give us a fraction of all the events in the
reactional biography which has determined what the individual
is at the present time. Furthermore, these events are not avail-
able in any record or anyone’s nervous system regardless of the
time and effort that we would be willing to put forward in ob- 
taining them. Nobody knows, at the present time, what the 
patient’s older brother said to him at the dinner table when the 
patient was four and one-half years old. What we see before us is 
the cumulative result of literally thousands of single learnings 
and unlearnings, not the least of which are those elusive kinds 
of learning which are involved in the internal responses which 
we call fantasy, not observed by anyone and since forgotten by 
the patient. For this reason, we are perpetually in the situation 
of trying to reconstruct initial conditions from a study of the 
results. It is here that the necessity of being able to think up the 
best hypotheses concerning the organization of the individual's 
personality arises, in spite of the assumption of complete deter-
minism.

I think that there may be a formal difference in the process of
prediction when it is carried out actuarially (in the Sarbin-
Lundberg sense of that word) and when it is carried out by the
skilled clinician via the use of a hypothesis. In the actuarial case,
let us suppose that the event to be predicted is a simple dichoto-
my, e.g., violates or does not violate parole. A finite although
possibly very large set of facts is known about the individual and
the particular combination of facts defines a subclass of the
population of individuals for which certain relative frequencies
have been determined. It may happen that the particular com-
bination before us has never heretofore arisen, but that some
inductions of a higher order have led us to the statement that
certain frequencies are independent of others, some frequencies
change in a specified manner with a value of certain properties,
and the like. To arrive at a prediction for the case at hand we
need only apply the probability calculus in a straightforward
fashion and thus arrive at a number which automatically deter-
mines what we predict. While the prediction considered as a
statement about the future is not a deductive consequence, in
that it does not follow necessarily but rather in probability,
the probability number reached is a purely deductive consequence
of the initial set of probability numbers, together with the rules 
of the game. If we now simply add the usual decision to predict 
always the more probable occurrence, the arrival at the predic- 
tion is obviously a matter of sheer deductive manipulation of a 
mathematical sort. 
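
The actuarial procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the facts, the cases in `PAST_CASES`, and the function names are invented here, and the sketch covers only direct lookup in the table, not the higher-order inductions about independence that the text mentions for combinations never before seen.

```python
# Hypothetical past cases: (facts about the individual, violated parole?).
# Every fact and outcome below is invented for illustration.
PAST_CASES = [
    (("prior_offense", "unemployed"), True),
    (("prior_offense", "unemployed"), True),
    (("prior_offense", "unemployed"), False),
    (("prior_offense", "employed"), False),
    (("prior_offense", "employed"), True),
    (("no_prior", "employed"), False),
    (("no_prior", "employed"), False),
    (("no_prior", "unemployed"), False),
]


def relative_frequency(facts, cases=PAST_CASES):
    """Relative frequency of violation in the subclass defined by `facts`.

    Returns None when the subclass has no entries in the table.
    """
    outcomes = [violated for known, violated in cases if known == facts]
    if not outcomes:
        return None
    return sum(outcomes) / len(outcomes)


def predict(facts, cases=PAST_CASES):
    """Predict the more probable outcome (ties go to 'does not violate')."""
    p = relative_frequency(facts, cases)
    if p is None:
        raise ValueError("no entries in the table for this combination")
    return ("violates" if p > 0.5 else "does not violate", p)


if __name__ == "__main__":
    print(predict(("prior_offense", "unemployed")))
```

Once the table is fixed, nothing in `predict` calls for judgment: the number follows from the entries, and the decision rule follows from the number.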

But if the prediction flows as a consequence of some sort of 
structural-dynamic hypothesis concerning the personality, the 
formal situation is different. For this hypothesis is not itself in any 
sense a formal consequence, i.e., it is not straightforwardly de-
ducible from the facts which support it. When the hypothesis has
been stated, the original data are seen as entailed by it, in con-
junction with the general laws and the rules of inference. But
someone has to state the hypothesis in the first place. It is in the
initial formulation of the hypothesis that there occurs a genuine
creative act with which the logician, as such, has no concern.
There is a stage at which someone must have thought up a
hypothesis which, in the context of discovery, was, to be sure,
suggested by the facts, but is not a formal consequence of them.
Whereas in the actuarial case, the frequency for a subclass is a 
formal consequence of the application of the principles of proba- 
bility to a set of data. 

Consider a nonpsychological analogy. Let us suppose that we 
have before us an opaque box, on one side of which is a row of 
ten buttons. A pressing of any three of these buttons constitutes 
a stimulus so far as the box is concerned. On the other side of the
box is a row of ten colored lights, whose pattern of flashing on
and off exhausts the box’s potentiality of response. Let us sup-
pose that the internal mechanism of the box involves a more or
less complicated series of interrelated gears, brackets, pulleys,
springs, sliding surfaces, and the like. Such a box is capable of
being stimulated by one thousand distinguishable stimulus pat-
terns. Suppose now that we permit an actuary to make a finite set
of observations upon the stimulus-response connections of the
box for statistical purposes. Certain rough probabilistic relations
will appear. For example, he might find that when any button is
pressed twice in succession, then, whatever is done on the third
pressing, 90 per cent of the time the response involves a turning 
on of six lights. He might also have made such observations on 
numerous boxes of the present sort whose mechanisms were simi- 
lar to (but not identical with) the present one. If we now ask 
him to predict the results of a certain combination of button- 
pressing which he has never tried in his sampling of the present 
box (or possibly not in any box that he had studied), he would 
have to be content to make a guess on the basis of some larger 
class of pressing combinations of which the specific combination 
we mention is a member. 
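
The actuary's position with respect to the box can be simulated in Python. The sketch assumes that a stimulus is an ordered sequence of three presses with repeats allowed, which gives the one thousand patterns mentioned. The mechanism in `lights_on` is invented purely so that the 90 per cent regularity of the example holds; nothing about it comes from the text.

```python
import itertools
import random

# Every ordered sequence of three presses on ten buttons: 10**3 = 1000 stimuli.
STIMULI = list(itertools.product(range(10), repeat=3))


def lights_on(stimulus):
    """Hypothetical hidden mechanism (invented here): number of lights lit."""
    a, b, c = stimulus
    if a == b and c != a:
        return 6
    return 1 + (a + b + c) % 5  # between one and five lights


def observe(stimuli):
    """The actuary's record: (stimulus, number of lights) pairs."""
    return [(s, lights_on(s)) for s in stimuli]


def actuarial_estimate(observations, stimulus):
    """Relative frequency of 'six lights' in the larger class containing `stimulus`.

    The larger class is 'first two presses on the same button' or its complement.
    Returns None if that class was never observed.
    """
    same = stimulus[0] == stimulus[1]
    outcomes = [n for s, n in observations if (s[0] == s[1]) == same]
    if not outcomes:
        return None
    return sum(n == 6 for n in outcomes) / len(outcomes)


if __name__ == "__main__":
    rng = random.Random(0)
    untried = (3, 3, 7)
    sample = [s for s in rng.sample(STIMULI, 300) if s != untried]
    print(actuarial_estimate(observe(sample), untried))
```

For a combination he has never tried, the actuary can report only the frequency in the larger class. In this invented box that frequency is 0.9 over the full population, and it is the same figure for (3, 3, 7), which lights six lamps, and for (3, 3, 3), which does not.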

Suppose now that we presented a similar problem to a skilled 
mechanic who had dismantled many such boxes in addition to 
having observed properties and their frequencies. With a small 
number of pressings, in this case very carefully chosen, he could 
conceivably be led to the formulation of a hypothesis concerning 
the particular structure of the internal mechanism. It is true that 
this hypothesis formed by him might be erroneous. But it also 
might be correct; and if correct, would lead to definite predictions, 
having a very high success frequency. 

It might be objected that we have subtly included actuarial 
information by specifying that the skilled mechanic “has taken 
apart many such machines.” This is admitted. As I have indicated
earlier, if the word “actuarial” is used as an equivalent of
“experiential-inductive,” then the only clinicians who would deny
that they operate actuarially are those who claim to be prophets
and clairvoyants. But I have tried to make clear that this use of
the term “actuarial” is so broad as to remove all meaning from
the issue at hand, and to take all the sting out of Sarbin’s argument.
Furthermore, it is one thing to say that it is necessary that
the mechanic should have certain actuarial data to be able to
formulate a good structural hypothesis, and it is quite another
thing to say that such actuarial data are sufficient for him to
arrive at the prediction, were he not skilled. By this I mean that he
is enabled to invent a particular hypothesis concerning the inner
workings of the present box because he has had experience with
such boxes in the past; but the hypothesis at hand is not something
derivable as a mechanical or statistical consequence of the
set of frequency statements which actually make up his previous 
experience. Being a skilled mechanic means that, on the basis of 
his actuarial experience, his brain has become capable of the 
creative act involved in formulating a hypothesis about the pres- 
ent unique box. 

I think it is obvious that we could present the statistician with 
a table of relative frequencies concerning numbers of gears, positions
of pulleys, and the like for the same sample of boxes which
the skilled mechanic has dismantled, without having any assurance
that the actuary would be able to invent the correct hypothesis.
Whether in the long run predictions arrived at by the
creation of such hypotheses are more trustworthy than those
arrived at by a straightforward application of the frequency
tables is an empirical question which would depend factually
upon such things as the degree of complexity of the parts, the
skill in hypothesis-making of the mechanic, and the size and
diversity of the sample available to the actuary. I have merely 
tried here to indicate by a mechanical example the kind of situa- 
tion which I feel is involved in high-level clinical activities. 

A learning history begins with an organism for which the 
parameters occurring in the functions descriptive of the learning
process are different from those parameters in other organisms.
There are individual differences in initial behavior readinesses,
as in the susceptibility to anxiety, ease of producing crying,
and the like. There are individual differences in behavior aspects
which are in some degree irrelevant to satisfactions of the drive
which becomes connected to them but which may later on in the
life history acquire significance. Such “temperamental” variables as
energy expenditure and response tempo may vary widely from
member to member of a response class defined by a very broad
topography sufficient to guarantee the reinforcement and hence
the strength. These expressive aspects of the behavior
may take on a positively or negatively adaptive function
when, for example, the social stimulus value of the individual
acquires a different sort of relevance for his rewards than
was the case at the time the behaviors were being patterned.
To the extent that secondary (derived) needs play an
increasingly important role in the determination of behavior, we
have even greater possibilities for variations. The goal states of
affairs which are reinforcing are themselves rather complex stimulus
configurations and sequences, the physicalistic description of
which may be difficult if not impossible to specify for the whole
group of organisms, even those having a certain homogeneity of
the life history as is guaranteed by a common culture. Since derived
needs are so heavily stimulus-dependent, they are
specified by the class of configurations which represent goals for
them, the usual indicators of docility, etc., being used to identify
this class.

The uniqueness of the learning history brings about a uniqueness
in the defining properties of the stimulus class which constitutes
a reduction for one of these higher order needs, and thereby
brings about a uniqueness in the needs themselves. To the
extent that a very large part of human behavior is maintained on
the basis of anxiety reduction and is heavily dependent upon a
large and complex set of verbal and other symbolic social- and
self-reinforcements, there is a perfectly legitimate sense in which
we can say that the important needs considered by clinical psychologists
are idiosyncratic in a way that the drives we study in
the animal laboratory are not. I do not mean to suggest that the
hunger drive of a given rat is not unique, since, as stated above,
I would argue for the literal truth of Allport’s view in the animal
case as in the human. But the extent to which the sugar-hunger
of a rat in one of our experiments has about the same quantitative
characteristics and appears in the same role as a variable as
in the case of another rat is presumably much greater than the
extent to which artistic interest shows person-to-person similarity
in the human adult.


It is likely that in addition to marked individual differences in
primary stimulus generalization gradients, there are
derived or learned generalization gradients which result in very
different potentialities. We are presented not only with the differences
in response dispositions, but with differences in the disposition
to acquire dispositions of various sorts. The principle of
cumulative causation operates here so that the effects of rela- 
tively minor fluctuations in initial conditions may produce fan- 
tastically great complications (cf. London, 67). We have, in the 
adult, variations in needs, not merely in their strength but also 
in their defining properties; variations in the defining properties 
of those habits which are cued to these idiosyncratic sets of 
needs; variations in the defining properties of the stimulus classes 
which perform both the cue and the reward function with respect
to the need-habit accommodations involved; and finally variations
in the functions relating some of these to others, as in the
extent of generalizations from one class of needs to another or
from one class of habits to another. The system of needs and
associated response tendencies including the interrelations among
them is often referred to as the “personality structure.” The word
“structure” here is perhaps not too happily employed, since it
fosters some rather noncontributory imagery. Nevertheless, it
seems to me there is a legitimate sense in which the properties
we think of as structural apply here.

In the first place, there is an element of relative stability. Cer- 
tain response dispositions may be modified by experience, but 
always in terms of rather permanent second-order dispositions 
such as referred to above. In order to predict what a given human
being will learn when put in a specified situation or sequence of
situations, it is not, in general, sufficient for us to have knowledge
of the general laws of learning, e.g., the principle of reinforcement,
generalization, the multiplicative function of drive, and
the like. We ought to know, for example, the behavior readinesses
(initial strengths) which he brings to the situation, since those
responses with a little greater initial strength will cumulate their
advantage by occurring and being given reinforcement before
the alternatives have an opportunity to appear at all. Even if
these readinesses were known and the organization of the environment
were also known, so that we could predict the initial
members of the response series and their stimulus consequences,
we would still have to know the reinforcement properties of these 
stimulus consequences. This in turn involves goals, that is, the 
rather complicated properties (or, better, dimension values) of 
the individual’s needs. In ordinary life, for instance, we do not 
attempt to predict with any confidence what a civilian will learn 
from three years of exposure to the rather homogeneous environ- 
ment of military life, without having a little knowledge of his 
civilian personality. Whatever more permanent second- and 
higher-order dispositions are involved, they constitute a sort of 
stable structure in which particular learnings occur. 
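
The remark that a response with a little greater initial strength will cumulate its advantage can be illustrated by a deliberately crude toy model. Everything in it is an assumption made for illustration: the rule that the strongest response is always the one emitted, the fixed reinforcement increment, the response names, and the function run_trials. It is not offered as a learning law.

```python
def run_trials(initial_strengths, n_trials, increment=0.1):
    """Toy winner-take-all model (an illustrative assumption).

    On each trial the currently strongest response is emitted and is
    reinforced by a fixed increment; the alternatives are unchanged.
    """
    strengths = dict(initial_strengths)
    counts = {name: 0 for name in strengths}
    for _ in range(n_trials):
        emitted = max(strengths, key=strengths.get)  # strongest response occurs
        strengths[emitted] += increment              # and is reinforced
        counts[emitted] += 1
    return strengths, counts


# Two hypothetical responses differing only slightly in initial strength.
strengths, counts = run_trials({"response_1": 0.51, "response_2": 0.50}, 20)
print(counts)  # {'response_1': 20, 'response_2': 0}
```

In this sketch a difference of one hundredth in initial strength decides every one of the twenty trials, which is the sense in which knowledge of the general law, without the initial readinesses, leaves the outcome undetermined.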

A second sense in which the structure notion applies is that of
levels or layers. We observe that A, who ordinarily approves of
B, becomes touchy, moody, and irritable when he goes shopping
with B. This we explain by showing that A’s conduct toward
salespeople makes B feel inferior, since B is unable to avoid buying
things that he does not want from an aggressive salesperson.
Why is this? We explain this fact in terms of a general disposition
to be passive, overly compliant, and generally fearful of arousing
the antagonism of others. Why does he have this characteristic?
We explain this in terms of an overlearned reaction-formation
against his own hostility, the strength of which is maintained by
its anxiety-reducing properties. How was this learned? We look
to his life history to find out why early manifestations of hostility
were more anxiety-arousing in him than in other people.
In terms of the historical sequence of the successive acquisition 
of members of this response chain, in terms of the (truncated) 
sequence of response dispositions at present, and finally in terms 
of the degree of defense against recognition (“depth”) of the 
components of the sequence, there is a sense in which we can
speak of “layers” of the personality and, hence, of a structure.

Dr. David Grant has suggested to me that even though certain
predictions about human individuals may be based upon relatively
complex hypotheses concerning structure in that sense,
and hence are not derivable from the data by a direct application
of probability calculus unmediated by the intervening step of
hypothesis formation, nevertheless the formulation of the hypothesis
itself theoretically can be handled in terms of Bayes’
Theorem. I have no wish to be dogmatic on the point but I am 
not persuaded that this is the case. In order to apply Bayes’ 
Theorem, it is necessary that we should have before us a set of 
alternative conditions, for each of which an initial probability is 
known, and upon each of which the probability of a certain sign 
or symptom is also known. How is the set to be specified in the
personality case? I leave out entirely the pragmatic question,
whether we have even approximations to the actual probability.
It is not clear to me what this distribution of generating alterna- 
tives could consist of. 
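
What Bayes' Theorem requires before it can be applied may be made concrete by a small sketch. The two urns, their prior probabilities, the likelihoods of drawing a red marble, and the function name posterior are all hypothetical; the sketch shows only that the computation takes a fully specified set of alternatives as its input and cannot supply one.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Bayes' Theorem over a finite, explicitly specified set of hypotheses.

    priors:      {hypothesis: P(hypothesis)}
    likelihoods: {hypothesis: P(sign | hypothesis)}
    """
    if priors.keys() != likelihoods.keys():
        raise ValueError("every hypothesis needs both a prior and a likelihood")
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("the sign is impossible under every listed hypothesis")
    return {h: joint[h] / total for h in joint}


# Hypothetical urns: the set of alternatives must be stated in advance.
priors = {"urn_A": Fraction(1, 2), "urn_B": Fraction(1, 2)}
# Hypothetical probability of drawing a red marble from each urn.
likelihoods = {"urn_A": Fraction(3, 4), "urn_B": Fraction(1, 4)}

post = posterior(priors, likelihoods)
print(post["urn_A"])  # 3/4
print(post["urn_B"])  # 1/4
```

The function redistributes probability among the urns it is handed. An urn that nobody has yet thought of appears nowhere in its arguments, so the theorem itself offers no means of proposing it.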

For each individual we have, in principle, a set of structural 
hypotheses concerning his unique organization of needs and 
habits. I find it difficult to imagine what hypothesis about personality,
or about a segment of personality, corresponds to the
Bayes urns in this case. It is true that some kind of inductive evidence
must be the basis of deciding that a given hypothesis about
the personality structure probabilistically entails some part of
the evidence we have before us, but I have tried to make clear
that this much of reliance on previous experiences is not precluded
by Allport’s views or so far as I am aware by those of
anyone else. But it seems to me that the formulating of this
hypothesis amounts to the hypothesizing of a new urn, with a
certain distribution of marbles in it. And I fear that the formulating
of this hypothesis, when it acquires any appreciable degree
of complexity, is precisely that creative act which is possible
only for the clinician. You cannot apply Bayes’ Theorem to a
problem until you have specified the initial conditions; and this
means to state what are the various urns, and what are their
contents. If we reject a categorical analysis and recognize that
we deal not with response and stimulus and need classes but
rather with clusters, the elements of which differ on a whole set of
dimensions, we have then the continuous form of Bayes’ problem.
I suppose one must admit that in principle, perhaps in a “behaviorism”
stated at the micro-level, the procedure could be carried
through. But we are so very far from even approaching such
a situation that the direct synthesizing of such a hypothesis which
is, so to speak, merely suggested by the data is the only procedure
applicable in practice. To return to the analogy of criminal
investigation, it seems to me that Dr. Grant’s suggestion is
like saying it is not necessary to make use of the hypothesis-forming
skill of a police detective, since presumably the
location of a person is distributed according to some as yet
unknown probability and the likelihood of his behaving in a certain
way in a specified situation also has some definite although
unknown probability. Both of these statements are correct, of
course. But in the case of a particular murder, what is needed is
somebody who will think up a specific hypothesis concerning an
event sequence, that hypothesis not being constructible by
mechanical rules for combining the “distribution of people in
space-time”; although once the hypothesis has been formulated,
certainly nomothetic laws about behavior, properties of
instruments, and the like are utilized to show that it is confirmed
in such and such a degree. In the same way, a statistical analysis
of a distribution of frequencies of numbers of cogwheels, coefficients
of friction, and the like would be of some, but limited,
help in attempting to invent a hypothesis concerning the
boxes in our mechanical example. No one is denying that it is
precisely these distributions of occurrences in his own past which
have eventuated in the skill of the clinician. But this is not
tantamount to saying that a nonclinician could create the
hypothesis the clinician creates, by a mechanical treatment of
the distribution frequencies, even if known.

The relation of lawfulness and uniqueness to the problem of the
clinician’s contribution might be put in summary fashion as follows:
A law, such as the law relating habit strength to number of
reinforcements, is (1) in its form, nomothetic for a given species
(or an even larger biological group); (2) in its parameters, idiographic
but perhaps inferable, on the basis of second-level nomothetic
laws, from other parameters estimated on the given person;
(3) in its end terms, i.e., in the defining properties or dimensional
ranges that specify “S,” “R,” “G,” “D,” etc., strongly idiographic.
Since the history generating (3) is precisely what we do not 
know when confronted with the patient, the clinician must re- 
construct it, and from fragments chiefly on the dependent-varia- 
ble side. 

Philosophers of science usually distinguish between “general 
hypotheses” and “particular hypotheses.” The first are exempli- 
fied in such hypotheses as that of universal gravitation, the 
atomic theory, and the kinetic theory of gases. The latter refer 
to hypotheses concerning the state of affairs in a given space- 
time region; as, for example, that the American Indian came to 
this continent from Asia, or that Bruno Hauptmann was the 
murderer of the Lindbergh baby, or that the solar system was 
formed by a passing star. The setting up of hypotheses of the 
first type involves a special creative act in which the scientist 
has to “see” that the facts e at hand could be deduced from the
hypothesis h. Presumably the difficulty of this seeing would be in
considerable degree dependent upon the similarity of the hypothesized
entity or process to things already familiar. From the
methodological point of view, the formation of particular hypotheses
is a different sort of thing; but seen psychologically, it
might be said that where the variables are extraordinarily complicated,
and knowledge relatively scanty, the psychology of the
hypothesis-forming act may be rather similar in the particular
and in the general type. What I am suggesting is that high-level
clinical hypothesizing partakes to some degree of that kind of
psychological process which is involved in the creation of scientific
theory. It is from this point of view that one can do justice
to the intuitive and nonrational element of clinical work without
committing oneself to any unscientific heresy. For example, analysts
have spoken of the “resonance” of the therapist’s unconscious
with that of the patient.

Freud says: “Expressed in a formula, he must bend his own 
unconscious like a receptive organ towards the emerging unconscious
of the patient, be as the receiver of the telephone to the
disc” (“Recommendations for Physicians on the Psycho-Analytic
Method of Treatment,” Collected Papers, II, 328). Reik has indicated
a similar thing in his use of the term “conjectures” (83).
(See also Fenichel, 45, p. 5.) The important point here is to realize 
that what these authors are discussing comes under the heading 
of Reichenbach’s “context of discovery.” Having once conceived 
a particular hypothesis concerning a patient, we must, if we are 
scientific (I should be inclined to say even rational), subject this 
hypothesis to the usual canons of inference. That is, we must see 
whether the hypothesis will entail more of the known facts than 
others, a greater range or diversity of the known facts, will en- 
able us to make predictions that will square with general prin- 
ciples arrived at by previous inductions, can be fitted into the 
nomothetic scheme at the next lower level in the explanatory 
hierarchy, and so on. This is Reik’s “comprehension.” I do not 
see how any honest clinician can avoid answering these questions 
about his own hypotheses. But probably what has led some 
clinicians to talk as if they did not accept the usual principles of
justification has been the failure of some nonclinical critics to do
justice to the complexity and subtleties of the preliminary stage,
i.e., of the events occurring in the context of discovery. As
Fenichel says, the difference between psychoanalysis and the
other sciences with regard to the role played by the unconscious
is a quantitative one. When we ask “How did clinician A arrive
at hypothesis h?” we are asking a psychological question, and
we are talking about events which must not be dealt with in a
simple-minded fashion if the psychology of the creative act is to
be unraveled. When, on the other hand, we ask “How could
hypothesis h be justified (by clinician A or by anyone else) as a
rational activity?” we are asking a logical question in the context
of justification. Clinicians and their scientific critics are often
at loggerheads because the contexts of discovery and justification
are not kept distinct in conversations on clinical activity.

I should like to mention a few analogical cases in order to
lessen the [...] that may be felt by any actuarially minded
reader. The point is that clinical psychology is in this respect just like
[...], engineering, or any other applied subject
matter. If we ask the opinion of an expert engineer concerning
why a certain bridge has collapsed, it is obvious that he makes
use of certain general principles of mechanics, and in that sense 
he is proceeding actuarially. It is also obvious that he makes use
of his experience with collapsed bridges which is, in a sense, also 
actuarial. Nevertheless, an engineer under these circumstances 
does not sit down with a table of relative frequencies of bridges 
of this and that sort, built in this and that circumstance. Having 
gathered the facts, he attempts to state a hypothesis which may 
involve the assumption of a state of affairs which has not arisen 
with any previous bridge that he has studied; conceivably, even, 
which has never existed with respect to any bridge in the world
before. Admittedly, his choice of hypotheses will be determined
in part by certain vaguely known initial probabilities, e.g., he
may try to avoid a hypothesis which involves as one component
the assumption that a certain type of metal was badly cast, because
he knows that this sort of thing hardly ever happens. The
clinician should be willing to admit that he could hardly fail to
gain by having comparable statements of his working assumptions
made numerically explicit. That part of the clinician’s
thinking which involves the use of empirical frequencies could
not fail to be improved by having those frequencies objectively
determined in a table rather than subjectively stored up in his
skull. Where the actuarially minded critic is in danger of going
astray is in inferring too much from the “obvious” superiority of
explicit relative frequencies over vaguely apprehended trends,
thinking that the combining of frequencies is a full account of the
process of prediction even in those instances where a particular
structural or historical hypothesis is utilized in making the prediction.
The engineer will surely make fewer mistakes if he has a
handbook giving the range of tensile strength of various alloys
than if he comes to his hypothesis-creating with vague and partially
erroneous judgments on these matters. But whether the
distribution of tensile strength is known impressionistically or in
terms of an explicit frequency table—in either case some of the
available hypotheses will not occur as mechanical consequences
of his data.

Remarks on Clinical Intuition 


Although it is not the primary issue, I should like to make a
few remarks here concerning clinical intuition. In discussing the
statistical method in the clinical setting, we often hear that the
conflict is between mathematical and so-called intuitive procedures.
Although we clinicians talk a good deal about this, we do
not know very much about it and there does not seem to be much
American investigation of it with the exception of some of the
older work of Allport and his students. Without attempting to
review what experimental material we have available, it may be
profitable to say a little bit from the armchair. It seems to me,
from observations of my own clinical activity and that of other
workers, that the phrase “clinical intuition” commonly covers
two rather different situations.

The first, and the one which seems most irritating to nonclinicians,
is the situation in which a clinician responds with a diagnostic,
predictive, or postdictive statement about the patient
and when asked for the evidence states simply that he feels intuitively
that such-and-such is the case. “My third ear tells
me . . .” or “I don’t know, but I feel very strongly about this
patient that . . .” or “He gives me kind of a schizy feeling” or
“I think that if one has seen very many psychopaths of this
type, he cannot fail to see it in this patient,” or, a somewhat more
sophisticated and even apologetic statement, “I am sure I cannot
make the cues explicit, but I think that this patient is . . .”
I am sure that most of us will admit that even when we say this 
sort of thing confidently, perhaps on the basis of having checked
up on our guesses in the past, it is a somewhat unsatisfactory
state of affairs. It would be desirable, not only from the standpoint
of teaching clinical psychology, but on the basis of the general
advantages of making everything explicit, to be able to verbalize
the basis of one’s intuitive responses. Our research ought
to be directed to the making explicit of such cues by devices of
slow-motion photography, the application of group judgments,
the graphical and quantitative study of gesture and verbal patterns
of patients correctly versus incorrectly identified intuitively,
and the like. However, it is easy to make too much of a mystery
of this business, whether one is contented with it or antagonistic.

I think that one of the difficulties lies in the implicit assumption
that one ought “naturally” to be able to verbalize the basis
of his response, and that the cases of inability to do so are rare
and constitute some type of paradox. It seems to me that this
is a mistake. There is no theoretical reason why the organism in
responding appropriately ought automatically to be able to emit
the verbalization which characterizes the physical situation constituting
the stimulus basis. I do not mean here to distinguish
between the clinician’s verbal and nonverbal behavior, which is
a common division in such discussion; actually the intuitive response
itself is generally verbal in nature, i.e., a diagnostic or
prognostic remark. The point is that some movement on the part
of the patient may have become a discriminative stimulus for a
certain predictive response, i.e., “this patient will lose his amiability
when you begin to get into his psychological problem,” and
there is no principle of learning with which I am familiar that
implies high strength for a verbal response such as “I make this
prediction because he showed such and such a movement.” The
verbal responses which are themselves descriptive of the stimulus
field are learned over and above other responses (including verbal
responses) not so descriptive but appropriate in some sense
or other and hence reinforced. The same is true of many responses
involving personal interaction.

It is a truism that there is a great difference between being 
able to tell somebody how to do something and doing it oneself. 
Failure to recognize this leads to a feeling that there is some- 
thing unique or peculiar about clinical intuition which requires 
special assumptions and explanations in order to avoid sounding 
mystical. Thus, we hear talk of the clinician responding to the 
“subliminal” or “minimal” cues. I confess I find it difficult to
imagine very much clinical response based upon cues which are 
subliminal, and I think such assumptions are quite unnecessary. 
How “minimal” the cues are is a matter for experimental study, 
but I see no reason for assuming that they are any more minimal 
than most of the cues which we respond to in ordinary life. When 
one tries to analyze his own clinical intuition, and succeeds in
making explicit the basis of such responses, it frequently turns
out to be nothing more than a matter of paying sufficient attention
to a kind of behavior on the side of the patient which is quite
gross in extent and intensity, and would not be entitled to the
term “minimal” in the ordinary perceptual sense (11, 53). It
would be surprising if such an important set of discriminative
stimuli as the expressions, gestures, inflections, and postures of
other human organisms did not become very finely discriminated
in their control over our behavior. But it would be equally surprising
if, in the absence of explicit formal instruction of the Dale
Carnegie or successful salesman type, there should be set up (in
addition) a set of verbal responses descriptive of the cue basis.
Once this is seen, there ceases to be anything special or paradoxical
about the obvious fact of this sort of clinical intuition,
and we have nothing to argue about except questions which are
settled by specific experiments. What sensory modalities are most
important, what individual differences exist among clinicians,
what are the personality or historical correlates of such individual
differences, what kinds of intuitive predictions are likely to have
the highest validity — these are among the many questions to be
investigated in detail in the experimental study of this process.

A second, although less common, use of the phrase “clinical intuition”
does not involve any reference to the verbalizability of
stimuli coming from the patient, but simply confesses an inability
to show in what manner a particular hypothesis was
arrived at from the stated evidence. In this also I see nothing
mysterious or paradoxical. What we seem to be asking for here
is a sort of rule or recipe for the creative act of hypothesis
formation; and when we cannot formulate one but find hypotheses
presenting themselves to our consciousness nevertheless, we feel
somewhat disturbed. Once having conceived a hypothesis about
the patient, we are not often troubled by any difficulty in showing
how this hypothesis is related to certain facts. It is true that
in explicating this relationship we make use of inadequately confirmed
general principles, but this does not introduce anything
different in principle from what occurs in the physical sciences
or in the hypothesizing of ordinary life. Let me illustrate by a
concrete example.

A patient tells a dream which begins as follows: “I was in the 
basement of my parents’ house, back home. It seems that I was 
ironing, and a fellow whom I had not seen since junior high school,
and whom I never went out with, and hardly knew, had brought
some shirts over for me to iron for him. I felt vaguely resentful
about this—oh, and by the way, he was dressed in a riding habit,
of all things” (grinning). Now, this patient had said in the preceding
interview that it would be too easy to get into the habit
of having sexual relationships with her present boy friend, and
that since she did not really care a great deal about him, she must
avoid this. If the phrase “riding habit” is a sexual pun, it
follows that the adolescent acquaintance whom she “hardly knew”
represents her present friend in the dream. The remainder of the
dream and her associations to it, which I will not reproduce here,
confirmed this hypothesis.

Such moment-to-moment “predictions” during the course of an
interview are made by all clinicians who use any sort of interpretive
therapy. Of course, we know little about their success
frequency or the reliance which ought to be placed upon them
in directing the interview’s course. But the validity and utility
of such prediction is not the point here. The important thing is
that a description of someone’s clothing in a dream would only 
rarely constitute a pun, and that the punlike character in the 
present instance becomes apparent only when we have in mind 
the particular situation in the patient’s sexual life and her way 
of speaking about it, from the previous interview. Since one can- 
not keep constantly in mind everything the patient ever said, 
what is required is that the verbal stimulus “riding habit” in close
temporal contiguity to verbalizations of a vaguely resentful sort, 
and presumably the vague awareness on the part of the listener 
that the identity of the old acquaintance is something needing 
to be clarified, combine to produce an association to the phrase. 
As Reik has emphasized, it would be difficult to write a prescription
telling anyone how to “have such associations.” In the
con[...], once having [...] a hypothesis, [...] it is con[...]
particularly [...] in question [...] have thought it up. One
[...] these: in dreams abstract [...] concrete forms and processes
[...] plastic representation [...] apparel in the situation of
[...], as does its insertion [...]
plastic representation for such an abstract notion as habit, and
we have the present result.

Sarason (84), in an excellent article on the interpretation of 
the TAT, has discussed the question of intuitive inferences in the 
case of this instrument. On the whole, I think his treatment is 
admirable, but it seems to me one might still carry from it the 
implication that ultimately clinical procedure will be irrational 
unless the steps of hypothesis formation are explicated. Sarason 
does not actually say this, but the general tenor of his treatment 
might imply it to some readers. It is a mistake to equate rational 
predictions to mathematical-mechanical predictions, which makes,
for example, scientific crime detection irrational because it does
not proceed explicitly actuarially. So it seems to me it is dangerous
to require that in the process of hypothesis creation, i.e., in
the context of discovery, a set of rules or principles (recipes, for
example) is a necessary condition for rationality. What should be
required is that a hypothesis, once formulated, should be related
to the facts in an explicit although perhaps very probabilistic
way. But to come to the hypothesis may require special psychological
dispositions on the part of the clinician which are only
acquired by experience superimposed upon what may or may not
be a fundamental personal talent. The teachability of such a
general hypothesis-forming disposition is, of course, an important
problem which has hardly been investigated at all.

Let me conclude these speculations and emphasize their outcome
with an examination of Sarbin’s paper, “Clinical Psychology:
Art or Science?” (85). It must be obvious by now that I am
sympathetic to Sarbin’s point of view, in that I should like to see
clinical psychology become as scientific as possible and am impatient
with those who appear to revel in its irrational components.
There are a few clinicians who pay lip-service to the future
scientific status of clinical work, add sadly that “unfortunately”
at the present time it is not in this Utopian condition, and then
show by most of their off-guard behavior that if that Utopia
should miraculously be brought about in our generation they
would probably abandon the field and pursue other interests
more in harmony with their motivational structure. But for clini- 
cal psychologists who, in spite of possessing and respecting clini- 
cal know-how, nevertheless are genuinely committed to making 
the enterprise as scientific as its subject matter permits, it is 
important not to become impatient with the scientific because its 
more passionate proponents make mistakes. It is in the hope of 
avoiding this consequence that I am spending so much time upon
a detailed analysis of the Sarbin-Lundberg position.
Sarbin says: 


The present author agrees with Lundberg in that useful diag- 
noses always proceed from generalizations, whether based on a 
rigorous statistical method or upon a crude empirical method
which has been variously named intuition, insight, verstehen, etc.
When a clinician is put to the test to defend a diagnosis, he may 
resort to the statement that it was “the general feel of things” in 
the interview that influenced him. By pushing him back, how- 
ever, it is possible usually to discover the empirical basis for the 
diagnosis. That these inferences are informal and not made with 
the benefit of Hollerith cards and Monroe calculators is beside the 
point. They are drawn from the clinician’s cumulative experience. 
If they are not, then the diagnostic function must be relegated to 
individuals with some sort of magical power. “Thus the only pos- 
sible question as to the relative value of the case (or clinical) 
method resolves itself into a question as to whether the classifica- 
tion of, and generalization from, the data shall be carried on by 
the informal, qualitative, and subjective method . . . or the
systematic, quantitative, and objective procedure of the statistical
method” [citing Lundberg].

At this point the critic will hold up his hand and bid us go no
further: all that you say is true, he tells us, if you accept the
postulate that clinical psychology is a science. . . . The clinical
psychologist uses those scientific findings and techniques which
are applicable to his clinical problems . . . Then even while he
is developing such a complete personality study, he engages upon
the genuinely artistic task of helping the patient to solve his own
problem. [italics added] . . .
This expression, genuinely artistic task, without further definition,
leads us into a morass. The possible meanings for the
word artistic as used here are: (a) skill in the use of
tools; (b) individual explorations into the unknown; (c) possession
of a unique talent or gift; (d) so-called intuitive operations.

(a) If art means the skillful use of tools, then we must ask, 
Whence come these skills? It is unnecessary to elaborate on the 
point that skills are acquired from experience with tools. For example,
if a clinician can make ingenious predictions of social adjustment
from the perusal of certain psychological tests, he would
be demonstrating his skill. Such predictions are obviously made 
against a background of previous experience with psychological 
tests and social behavior. With this conception, the writer has no 
quarrel. It does not postulate a super-empirical method of under- 
standing. It is not, therefore, a material departure from the 
proposition that clinical psychology is scientific in that predictions
are made on the basis of empirical data.

(b) If art means individual explorations into the unknown, we
have no way of checking on the validity of predictions formulated
in the name of art. If a clinician should make a diagnosis
and prescribe treatment for a case that was unique, idiosyncratic,
in every conceivable way, he would be venturing into the unknown.
He would be guessing. This would be an expression of
personal taste. If the clinician had no experiential background,
no knowledge of similar cases, then he would be making a truly
individual prediction. Unless such a single prediction is ordered
to a class of events, it cannot be verified and is, therefore, meaningless.

(c) If, in this context, art means the possession of a gift or
talent for “making friends and influencing people,” then we can
look for little progress in the field of clinical psychology. If clinical
psychology is an art because some clinicians possess unique
traits, and if complex human problems can be solved only by
these specially-gifted people, then we must agree with Rogers
and “admit that we can never deal in any large way with the
multitude of ills which we group together as conduct problems,
since the talents of the artist can be little conveyed to his fellows.”
. . . Recognition of this problem is also given in one
of the most provocative books to be published recently on personnel
administration. Roethlisberger and Dickson make this
generalization on the basis of the outcome of a thoroughgoing research
program in personnel administration:

“The skill (of diagnosing human situations) should be ‘explicit’
because the implicit or intuitive skills in handling human
problems which successful administrators . . . possess are not
capable of being communicated and transmitted. They are the
peculiar property of the person who exercises them; they leave
when the executive leaves the organization. An ‘explicit’ skill, on
the other hand, is capable of being refined and taught and communicated
to others.” . . .


In this connection, it should be pointed out that the so-called 
art of interviewing, long considered an implicit or intuitive skill, 
has recently been studied, refined, and communicated to others. 
Porter . . . and Bordin and Sarbin . . . have studies in prog- 
ress which show how these so-called artistic skills may be taught 
and learned. 

(d) If art means some super-empirical method of understand- 
ing, then we must surrender our ideas about communicating 
techniques and procedures in clinical psychology. If we depart 
from the method of logical inference, i.e., the scientific method, 
then we must perforce adopt some so-called intuitive approach. 
Not inductive, not based on logical inference, the intuitive 
method of understanding is described by Klein as follows: 

“. . . (it is) the task of fathoming human motives or appreciating
the entire gamut of human desires . . . (it) requires a
knowledge of human nature. It represents the type of understanding
indispensable for the development of psychology as a social
science or as a Geisteswissenschaft.” . . .

The traditional methods of science, he points out, have a place
in psychology, but the intuitive approach, characterized
above, is to reap the harvest in psychology. (85, pp. 395-97.)


Let us consider Sarbin’s fourfold classification of “genuinely
artistic tasks” in the light of our previous discussion. Meaning
(a), the skillful use of tools, does not produce any disagreement
from Sarbin so there is little to say about it. It is, however, necessary
to be aware of the fact that when Sarbin says “such predictions
are obviously made against a background of previous experience,”
he is not proving that the prediction is actuarial in the
narrower sense, nor does it follow from his general statement that
if the clinician’s prediction is formalized in […]
“based upon experience,” and “the clinician is a second-rate substitute
for a Hollerith machine” is maintained, there seems to
be no basis for disagreement in Sarbin’s treatment of (a).

“(b) If art means individual explorations into the unknown, 
we have no way of checking on the validity of the predictions 
formulated in the name of art.” If read literally, this assertion is
simply incorrect. As I have tried to indicate in the discussion of 
Jones’ suicide, the validity of an artistically arrived at idiographic 
prediction is checked in the same way that any other prediction 
is checked —by waiting around to see whether or not it occurs. 
It is somewhat surprising that Sarbin should make this mistake, 
since his thought is obviously heavily influenced by the work of 
Reichenbach, who discusses at some length the problem of pre- 
diction in a disorderly world in which there appear to be no stable 
relative frequencies but in which a certain clairvoyant is able to 
anticipate the future. The primacy of his general inductive prin- 
ciple is established by making clear that even in such a world 
there is one relative frequency which is not completely unlawful, 
i.e., the class of the clairvoyant’s predictions. In such a world we 
would predict our futures by making use of the clairvoyant, but
it is foolish to do this until we have established that such a pro- 
cedure “pays off,” and this means an application of the funda- 
mental rule of induction to the clairvoyant himself. 

I do not mean to suggest that the clinician is not behaving on 
the basis of previously established frequencies and complex ways 
of combining them, but the point here is that even were there 
such a thing as a clinical clairvoyant, involving a genuinely
extra-mundane or supra-empirical basis of arriving at clinical
predictions, the predictions of such a clinician could be confirmed
or disconfirmed in the usual way, and it could also be decided
what degree of confidence we should have in his predictions.
Furthermore, it could be determined whether subclasses of the
set of all his clinical predictions differed significantly with respect
to their relative success frequencies. Such a finding would
lead us to place greater faith in him when he is predicting certain
kinds of events than others. It might even be shown that
although his predictions do not appear to be based upon any of
the facts available to him, so that sometimes he predicts success
in the presence of an alcoholic foster father and at other times,
everything else apparently being equal, he predicts failure under 
similar circumstances—it might nevertheless appear that, what- 
ever he predicts for the subclass of cases involving an alcoholic 
foster father, his success frequency is extremely high. We cannot 
agree with Sarbin’s statement that “unless such a single predic- 
tion is ordered to a class of events, it cannot be verified and is, 
therefore, meaningless” (85, p. 396). Nor can we agree that “If
a clinician should make a diagnosis and prescribe treatment for
a case that was unique, idiosyncratic, in every conceivable way,
he would be venturing into the unknown. He would be guessing.
This would be an expression of personal taste” (85, pp. 395-96).
I do not suppose that any clinician imagines that he deals with 
patients who are completely unique and idiosyncratic, if by 
“completely” is meant that there are no similarities! But even if 
there were such a clinician, Sarbin would not be entitled to equate 
such a wholly idiographic procedure to “guesswork” or “taste.” 
Sarbin’s mistake here consists in equating the nondeductive or 
nonformal with the irrational. 
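
The bookkeeping described in this paragraph, tallying a predictor's hits and misses both over all his predictions and within subclasses of them, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the function name `success_frequencies`, and the toy records are invented for the example and are not data from any study discussed here.

```python
from collections import defaultdict


def success_frequencies(records):
    """Return (overall, by_subclass) relative success frequencies.

    Each record is a (subclass, predicted, actual) triple; a prediction
    counts as a hit when predicted == actual. This record layout is an
    assumption made for the illustration.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for subclass, predicted, actual in records:
        totals[subclass] += 1
        if predicted == actual:
            hits[subclass] += 1
    by_subclass = {s: hits[s] / totals[s] for s in totals}
    overall = sum(hits.values()) / sum(totals.values())
    return overall, by_subclass


# Hypothetical follow-up records for one predictor (invented values).
records = [
    ("alcoholic foster father", "success", "success"),
    ("alcoholic foster father", "failure", "failure"),
    ("alcoholic foster father", "success", "success"),
    ("other", "success", "failure"),
    ("other", "failure", "failure"),
    ("other", "success", "failure"),
]

overall, by_subclass = success_frequencies(records)
print(overall, by_subclass)
```

Nothing in the tally depends on how the predictions were arrived at; it needs only the predictions and the outcomes, which is the point being made against Sarbin.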
Suppose a clinician should come upon a fantastic organism 
which, although behaving lawfully, did not behave in accordance 
with any of the psychological laws of organisms in the clinician's 
experience. Given a considerable mass of material, still actuarial 
in Lundberg’s sense of involving repeated episodes in its life his- 
tory, the clinician might be led to the construction of a “theory” 
about this individual organism. This “theory” would be defended 
by the clinician on the grounds of its capacity to entail the known 
facts about the individual’s previous and present behavior, and 
would be capable of entailing certain predictions about the future. 
[Leaving] open the question as to whether the prediction will […]
(which will depend on whether the theory thus con[…]),
the important point is that the clinical ac[…]
in the trivial […] the word […]
it to inductive; and even the inductions
do not apply to any organism except the present one. The source
of Sarbin’s difficulty here is his belief that to deal with novelty
we must either show that the novelty is merely apparent, or else
we must have recourse to nonrational methods.

I am sure that Sarbin does not feel this to be the case in scien- 
tific theories of a general sort, where from time to time in the his- 
tory of science, as for example in the electromagnetic theory, 
whole aspects of the world began to be investigated which were
genuinely novel. It is true that the symptoms of the magnetic 
field involved events which were describable in terms of mechan- 
ics, e.g., the deflection of a needle. But the laws and constructs of 
electromagnetic theory were of a different sort. Modern atomic 
physics has had occasion to introduce many objects and events
which bear only the crudest analogical relationship to anything
seen in macroscopic experience, and in many cases physicists
have had to endow certain subatomic events and processes with
characteristics that do considerable violence to our ordinary conceptions.
No one doubts seriously the capacity of the human intelligence
to make sense out of a fairly complex set of observations,
even when the processes and laws involved are new. But
here again what is involved is a capacity to invent such theories, 
and we have moved outside the province of formal logic. In the 
formal disciplines, the logician can tell us almost all we have to 
know about how to make inferences, and can make clear their 
logical structure. In the empirical field, he is barely beginning to 
reconstruct the basis of confirmation of hypotheses, using as a
model even the simplest kind of world; and there is at present no 
hint that he will ever be able to tell us how to make up the sen- 
tences which are confirmed by certain evidence. In one restricted 
sense of the word, it must be admitted that all empirical hypothesis-making
is nonrational, in the sense that explicit instructions
for creating hypotheses cannot be stated. But this is surely not
a use of the word “nonrational” which is important here.

“(c) If, in this context, art means the possession of a gift or
talent for ‘making friends and influencing people,’ then we can
look for little progress in the field of clinical psychology.” I am
at a loss to understand how Sarbin arrived at this statement. If
high level operations in clinical psychology depend upon certain
special human traits, then progress in clinical psychology will 
obviously be furthered by the use of suitable methods of selec- 
tion for those traits. If we were to be forced to the conclusion that 
it is impossible for the actual day-by-day operations of the clini- 
cal psychologist to become explicitly scientific, we could still, as 
scientific personnel men, set up procedures for the selection of 
students on the basis of these talents. To carry the argument 
further, it might be discovered that only “skilled, intuitive clinicians”
could detect the characteristics of “skilled, intuitive clinicians.”
Even this would not discourage us with respect to improving
the status of clinical psychology, since these clinicians-for-selecting-clinicians
can themselves be investigated as we
investigate Reichenbach’s clairvoyant. Somewhere along the line, 
in terms of some kind of rating, outcome, or mixture of human 
judgments, we will arrive at a place where everybody, intuitionists
and statisticians alike, will agree we have to lay our cards on
the table. Only practical difficulties, but nothing in principle,
should lead to Sarbin’s pessimistic conclusion from his premise.
The advantages of being able to make certain skills explicit from 
the standpoint of teachability, and the other desirable conse- 
quences of having our knowledge communicable, are, of course, 
not to be denied. 
Probably there will always be aspects of an individual’s be- 
havior which are relatively unteachable and which contribute 
materially to his clinical functioning. Certain talents for rapport-setting
probably depend in part upon characteristics of features,
build, voice, gesture, and choice of words, facial expression,
and the like. Unless Sarbin believes in the infinite plasticity of
adult human organisms, he should allow the possibility that there
are combinations of personal traits in a would-be clinician which
[…] inept at some kinds of clinical activity. Even if
[…] all there is to know about the dynamics of the
[…] depends not merely […]
but also on other aspects of our nature. Much clinical work involves
activity in addition to comprehension. I may decide (even
actuarially!) that the patient needs a dominant, even stern re- 
action from me, at this moment. Can I exhibit one? 

The matter of timing is also important in this connection. Suppose,
for example, that statistical studies of a factor analytic
type (P technique) should show that a certain way of speaking, 
and even a particular choice of words, is associated with a 
patient's hostility to a sister. Hostility to sisters is something seen 
in many patients, but the tie-up between that nomothetic char- 
acteristic and the special peculiarities of this patient’s language 
is idiographic, having arisen on the basis of a unique set of unusual
experiences. In order to prove that this association exists,
it may be necessary to carry out a very long and complex kind of
statistical analysis on the verbal protocols of the individual case.
There is no alternative to this, in the context of justification; 
that is to say, when the clinician says “every time he talks this 
way, no matter what the content of his conversation is, I know
that I hear unconscious material dealing with his hostile attitudes
toward his sister,” he has to prove it somehow. In the therapeutic
handling of the case, it is impossible for the clinician to get
up in the middle of an interview, saying to the patient, “Leave
yourself in suspended animation for 48 hours. Before I respond
to your last remark, it is necessary for me to do some work on my 
calculating machine.” And I do not think that this absurd illus- 
tration arises only because of our limitations of knowledge. 

If I may be permitted another analogy, consider the case of the
skilled baseball player. There is not much concerning the mathematical
ballistics of the baseball, or the physiological and mechanical
principles of locomotion, which is not understood sufficiently
for all practical purposes. But no physicist, physiologist,
or psychologist would argue that the writing of the differential
equation of the baseball’s path, and the analysis of the movements
of the player’s body in terms of metabolic activity producing
energy to work on a complex system of third-class levers,
would enable him to be at the right place, at the right time, in
the right position, to perform the fielder’s function.

What I am saying is that even in the Utopian stage of clinical 
psychology, when we have sufficient methods of selecting clini- 
cians and have made explicit all that can be made explicit about 
the psychological principles we use, at the moment of action in 
the clinical interview the appropriateness of the behavior will 
depend in part upon things which are learnable only by a multi- 
plicity of concrete experiences and not by formal didactic ex- 
position. If Sarbin means to include this multiplication of con- 
crete experiences under the heading of “teaching,” we have no 
quarrel with him. But that the existence of certain kinds of be- 
havior and discrimination are the results of such an accumula- 
tion of experiences is precisely what most of us have in mind 
when we refer to the artistry of the individual who is clinically
skilled. Whether or not there are even biologically given indi- 
vidual differences of certain kinds of potentialities for clinical ob- 
servation and operation we do not know. In the absence of any 
statistical or experimental evidence on this point, I can only 
say that I am appalled by the ability of some students to spend 
a couple of years in contact with clinical material and with constant
opportunity for interchange with skilled clinicians and to
retain an incredible blindness for all those clinical signs which 
they have not been specifically told to look for. Not only does the 
meaning of a behavior datum depend in most instances upon a
half-formulated hypothesis concerning the case at hand, as indicated
above in our discussion of the clerical worker; but all of
this is impossible from the beginning unless the practitioner
[notices] the behavior in question. Individual differences in such
sensitivity, the source of these differences in heredity or in very
early interpersonal learnings, its modifiability as a result of normal
life or practicum training and the like, all are experimental
questions which neither Dr. Sarbin nor I am in a position to
[answer].


With regard to (d) from Sarbin’s article, what has just been 
said is probably sufficient. I am sure my remarks will not be interpreted
to mean that I anticipate or desire that the intuitive
approach will “reap the harvest” in psychology.



Empirical Comparisons of Clinical and 
Actuarial Prediction 


For some reason the literature contains almost no carefully executed
studies of the clinical-actuarial issue. Although a number
of psychologists, psychiatrists, and sociologists have discussed 
this problem, empirical evidence concerning the relative efficacy 
of the two methods of prediction is largely wanting. I have been 
struck by the fact that both statisticians and clinicians often 
seem to think the answer is “obvious,” the trouble being that they
don’t agree on what it is! 

Allport (5) cites what he considers to be evidence for the
superiority of the case-study method: “Studies should be made
of the relative success of actuarial and case study predictions. If
sensitive judges employing adequate documents commonly excel
in their forecasting, we shall know that actuarial predictions are
not the apex of scientific possibility, and shall conclude that the
prevailing empirical theory is too meager to apply to the optimum
level of prediction.” (5, p. 160.) Allport adds in a footnote: “Already
there seems to be considerable evidence that case study
prediction excels. The experiments of Estes, F. H. Allport and
Frederiksen, and Polansky are all relevant. To be sure these
experiments are limited in scope; but they can be, and should be,
extended.” (5, p. 160.)

The three empirical investigations here cited by Allport are
interesting and have a tangential connection with the present 
issue. But I do not believe that these studies contribute as much 
to a solution of the empirical problem of clinical and actuarial 
prediction as Allport thinks. My analysis leads me to think that 
these three studies are, in fact, largely irrelevant. Any empirical 
study of actuarial versus nonactuarial predictive techniques 
should involve the making of predictions from similar or identical 
sets of information by the two methods, and a comparison of the 
success frequency arrived at in these two ways. Obviously, any
investigation which does not anywhere involve the making of 
predictions upon an actuarial basis cannot make such an empirical
comparison of predictive efficiency. None of the three studies
Allport cites involves the making of predictions of an actuarial
type. Hence, they can have, at most, a feebly supportive role with 
respect to Allport’s major contention. 
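
The design requirement stated in this paragraph (the same cases and the same information, predicted by both methods, followed by a comparison of success frequencies) can be sketched as follows. The function names, the 0/1 coding of outcomes, and the toy figures are assumptions made for illustration only; they are not data from any of the studies discussed and carry no evidential weight.

```python
def hit_rate(predictions, outcomes):
    """Proportion of predictions that match the observed outcomes."""
    if len(predictions) != len(outcomes) or not outcomes:
        raise ValueError("predictions and outcomes must be equal-length and non-empty")
    hits = sum(1 for p, o in zip(predictions, outcomes) if p == o)
    return hits / len(outcomes)


def compare_methods(clinical, actuarial, outcomes):
    """Score both methods against the same cases and the same outcomes."""
    clinical_rate = hit_rate(clinical, outcomes)
    actuarial_rate = hit_rate(actuarial, outcomes)
    return {
        "clinical": clinical_rate,
        "actuarial": actuarial_rate,
        "difference": clinical_rate - actuarial_rate,
    }


# Invented example: eight cases, outcome coded 1 (success) or 0 (failure).
outcomes = [1, 0, 1, 1, 0, 1, 0, 0]
clinical = [1, 0, 0, 1, 0, 1, 1, 0]   # hypothetical judge's predictions
actuarial = [1, 0, 1, 1, 1, 1, 0, 0]  # hypothetical table-based predictions

result = compare_methods(clinical, actuarial, outcomes)
print(result)
```

A study that supplies only one of the two prediction lists cannot fill in the `difference` entry, which is the sense in which the three studies are said to be largely irrelevant.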

Let us begin by a brief consideration of Polansky’s study (80).
The essential point of Polansky’s investigation was a comparison
of the success frequency of predictions made by a group of judges
on the basis of case histories, each of which had been written in
six different ways. These six modes of writing a case history are
called by Polansky structural analysis, cultural presentation,
genetic presentation, major maladjustment, the presentation of
typical episodes, and individual differences (psychometric). The
life histories were those of three subjects, and for each subject
all six types of life histories were written. Each of 36 judges made
predictions twice for each of the three subjects, once by each of
two methods. The judges were asked to predict, using a 12-item,
[…] questionnaire, 12 factual items about the subject, […]
the subject did when he was broke, […]
views on Marxism, […] his hobbies, his dress, […]
his religious beliefs, his vocational choice […]
alternatives, and so forth. The three
subjects were “three friends of the experimenter . . . similar in
[…] basic cultural background.” Polansky’s control investigation
of “cultural chance probability” utilizing Harvard
College students in combination with the above brief characteri- 
zation of the subjects indicates that these three subjects were 
Harvard College men. This homogeneity will be important in 
our discussion of the study. 

Polansky analyzes his data in several ways, but the most im- 
portant question is the relative predictive power of the six types 
of life histories. Analysis of variance of the percentage of “hits” 
made by the use of the six modes establishes that there is a sig- 
nificant difference among them. The most predictively effective 
mode is the “structural analysis,” which is the kind of descrip- 
tion of a person favored by Allport and his school. The least effi- 
cient method of presenting a life history in terms of predictive 
success was the mode called “major maladjustment,” which is the
type of personality description Allport considers typical of the 
usual psychiatric report. There were marked and consistent dif- 
ferences in the subjective responses of the judges in their willing- 
ness to make predictions on the sets of data, in their feeling of 
understanding, acquaintance with the subject, and the like. For
purposes of the present discussion, our concern is the contrast
between the most efficient mode (Allport’s “structural analysis”) 
and the mode which turned out to be next to the bottom in pre- 
dictive efficiency, namely “individual differences” (psychometric 
mode). Of the total number of predictions made from the structural
analysis mode, 47.6 per cent were objectively correct; whereas
of the total number of predictions made by the psychometric
mode, only 36.9 per cent were objectively correct. This difference
in percentage of hits is statistically significant at the 1 per cent
level. I assume that it is this comparison which Allport considers
evidence on the subject of clinical and actuarial methods of prediction.
I shall make several critical comments on his interpretation.
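
As a small worked check on the size of the gap between the two modes, the arithmetic can be written out as below. The sketch takes the psychometric figure as 36.9 per cent, the reading consistent with the "about a 10 per cent difference" mentioned later in the chapter; that reading, the helper name `percentage_point_gap`, and the relative-gap calculation are assumptions of this illustration. The significance level reported in the text cannot be rechecked this way, since the underlying counts are not given here.

```python
def percentage_point_gap(pct_a, pct_b):
    """Difference between two percentages, in percentage points."""
    return round(pct_a - pct_b, 1)


structural_pct = 47.6    # hits from the structural analysis mode
psychometric_pct = 36.9  # hits from the psychometric mode (assumed reading)

gap = percentage_point_gap(structural_pct, psychometric_pct)
relative = round(gap / psychometric_pct, 2)  # gap as a fraction of the lower rate
print(gap, relative)
```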

In the first place, it might be suggested that Polansky loaded 
the dice somewhat against the psychometric prediction by his 
choice of measuring instruments. The administration of the 
Wechsler-Bellevue, the Nebraska Inventory, and the Bernreuter
Inventory to three subjects such as Polansky’s could almost be 
considered a waste of psychometric time. It is hardly conceivable, 
for instance, that three Harvard students would be sufficiently 
discriminated as regards intellect by a test with so little top as 
the Wechsler to make the obtained IQ’s of any predictive significance.
The evidence for validity in the case of the Nebraska Inventory,
the Bernreuter, and the Pressey X-O Test is hardly impressive
enough to warrant us in expecting that much of anything
could be predicted from these three devices. The only two
tests of the battery which I should be interested in knowing 
about when attempting to make predictions of the sort required 
in this study would be the Lentz Opinionnaire, which is relevant 
to only one of the twelve questions (subject’s attitude to Marx- 
ism); and the Allport-Vernon Study of Values. It should be 
pointed out that the available psychometric devices which would
be relevant to the predictions required are very limited in number
and the validity of many potentially useful instruments is
not definitively established. It is probable that Polansky’s judges
would have done somewhat better with the psychometric mode
had they been using information gleaned from such psychometric
devices as the Rorschach, the Strong Vocational Interest Blank,
the Kuder Preference Record, the MMPI, and, assuming that a
capacity test would be fruitful in the battery at all, a measure of
general intelligence more suited to the selection of subjects, e.g.,
a test of graduate ability such as the Miller Analogies. It should
also be mentioned that there is no good reason for preventing
judges in such a predictive situation from having access to the
actual item responses made by the subject. The fact that there
[…] overlap between such responses and […]
[…] not argue against such a procedure […]
[…] objection could be made to the […]
modes. In the case of the structural […]
as to whether the subject has had sexual intercourse. If part of
the structural analysis involves a statement concerning his vir- 
ginity there is no good reason why the judge should not make 
use of this information. But by the same token, in attempting to 
predict the subject's attitude toward Marxism, it is certainly
legitimate for the judge to know that when asked specifically in
a verbal questionnaire whether he thought highly of Marxism the
subject stated flatly that he was against it.

I do not wish to minimize in the least the terrible difficulties 
encountered in attempting to make concrete behavior predictions 
from psychometric data, even at their best. As a clinician I am 
fully aware of the peculiar feeling of “abstraction” which one gets 
in attempting to characterize a person from a set of test scores. 
I do not suppose that most practicing clinicians would sacrifice 
an hour of direct contact in an interview for any set of psycho-
metric scores, if compelled to choose, although there are many
individual cases in which the tests get at things which we do not
get at in the interview. My aim here is to emphasize that Polan-
sky’s battery would not represent the power of psychometric pro-
cedures at their best. In the light of these considerations it is
important to notice the fact that there was actually only about
a 10 per cent difference in predictive efficiency between the
psychometric mode and the structural analysis mode. It is really
rather surprising that the judges were able to do as well with the
psychometric data as they did! However, the crucial point is
that the Polansky study does not involve any empirical compari-
son of the actuarial and nonactuarial methods of combining data
for predictive purposes. Actually, all the predictions were made
clinically; that is, the judges combined the information received
in whatever manner seemed subjectively most appropriate, in
the absence of any exact knowledge concerning the statistical
relationships between this information and the to-be-predicted
behavior, and with only the most scanty evidence as to the proba-
ble behavior correlates of the independent variables. In terms of
the distinctions I have made previously, the Polansky study in-
volves a comparison among kinds of data, not among modes of
combining data. An actuarial prediction, in the sense at stake in 
this argument, would involve the use of tables showing, for ex- 
ample, the distribution of intercourse frequencies in cells defined 
by certain complex conjunctions of the data. The table (based
on empirical study of a suitable population such as male Harvard
undergraduates) would contain a cell for “cases having Bern-
reuter B1-N scores between 30 and 50, value profiles with eco-
nomic score as peak and religious low, no siblings, etc.” The fre-
quency distribution of sexual experiences within this cell would
make the judge’s prediction a clerical task. Note that here, as 
usual, the kind-of-data dimension cuts across the actuarial- 
judgmental dimension. Nothing remotely resembling such a pro- 
cedure enters into Polansky’s design. Therefore, the relation of 
this study to the main issue is tenuous at best. 
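
The clerical character of such an actuarial table can be illustrated with a minimal sketch. The cell labels, outcomes, and counts below are hypothetical and are not taken from Polansky or from any other study; the function names are likewise invented for illustration. The only point is that, once the table exists, prediction reduces to looking up a cell and reading off its frequency distribution.

```python
from collections import Counter, defaultdict


def build_actuarial_table(cases):
    """Tally criterion outcomes within each cell of predictor values."""
    table = defaultdict(Counter)
    for cell, outcome in cases:
        table[cell][outcome] += 1
    return table


def cell_distribution(table, cell):
    """Relative frequency of each outcome in a cell (empty dict if unseen)."""
    counts = table.get(cell, Counter())
    total = sum(counts.values())
    return {outcome: n / total for outcome, n in counts.items()} if total else {}


def actuarial_prediction(table, cell):
    """Predict the most frequent outcome in the cell; None if the cell is empty."""
    counts = table.get(cell)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical norm group: (cell, observed outcome) pairs. Not real data.
CELL_A = ("B1-N 30-50", "economic peak, religious low", "no siblings")
CELL_B = ("B1-N above 50", "aesthetic peak", "has siblings")
norm_cases = (
    [(CELL_A, "yes")] * 7 + [(CELL_A, "no")] * 3
    + [(CELL_B, "yes")] * 2 + [(CELL_B, "no")] * 6
)

table = build_actuarial_table(norm_cases)
print(actuarial_prediction(table, CELL_A), cell_distribution(table, CELL_A))
print(actuarial_prediction(table, CELL_B), cell_distribution(table, CELL_B))
```

Nothing in the lookup depends on whether the cell is defined by test scores or by case-history facts, which is the sense in which the kind of data and the mode of combination are separate dimensions.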

The study by Estes (40) concerns the judgments of personali- 
ties from expressive behavior. The subjects were fifteen of the 
cases studied intensively by Murray in the Explorations, and the 
behavior sample available for judgments was moving pictures of
the subjects carrying out simple tasks, such as lighting a match,
putting on a coat, building a house of cards, and Indian wrestling.
These findings are of considerable interest to the clinical psy-
chologist, but again, we do not have any comparison of clinical
and actuarial methods of prediction. Here it is even more ob- 
vious than in the Polansky study that all the predictions were 
made nonactuarially. Those judges who had been specifically 
trained in scientific and analytic methods of arriving at their 
opinions, e.g., experimental psychologists, students of the physi[...]
[...] of fine arts, dramatics, and so forth.
[...] there is no actuarial basis available to
the judges, they have to be impressionistic; and the best “im-
pressionists” are those who spend their time doing this kind of
thing!
[...] here to deal with that sort of
[...] synthesizing operation at which the [...]
of trained clinicians is presumably adept.
If it is true that training in analytic thinking and logical re- 
construction of evidence reduces such skill, it might be desirable 
to avoid clinical training procedures which tend to produce such 
results. The all-important problem of the “clinician as instru- 
ment” is being discussed these days, and we psychologists should 
learn from investigations of the Estes type. But the present ex- 
perimental design simply does not involve a comparison between 
clinical and actuarial modes of combining data, and consequently 
is not suited to Allport’s purposes. 

The study of F. H. Allport and Frederiksen (3) makes use of 
the method of correct matching. A certain dilemma involving 
moral decision is presented to a group of subjects, who are in-
structed to write a paragraph predicting the response of 5 of
their friends. These 5 friends are independently presented with
the same dilemma, and the problem for the judges is to match
the actual responses as written by the friends with the predicted
responses as written by the other subjects. Of the total of 1530
single matchings made, the investigators found 24.9 per cent
correct matchings as contrasted with a chance value of 20 per
cent. Because of the large N this result is statistically significant. 
However, when one considers that the actual results are only 
4.9 per cent better than chance, it is difficult to see how any par- 
ticular importance can be attached to the results. The statistical 
significance is obtained not because of the high degree of accuracy 
of the judgments but because of the very large number of re-
sponses that go into the significance test. In fact, the stability of
the observed low percentage with such an N allows us to state 
that this sort of matching can be done very poorly! Again, no 
direct comparison of actuarial and nonactuarial methods of pre- 
diction is involved. For purposes of Allport’s argument, this study 
is also more or less tangential.
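
The dependence of this significance on sample size can be made concrete with a rough normal-approximation check. This is only a sketch: it treats the 1530 matchings as independent trials with a chance rate of .20, an assumption the source does not state and which is probably not strictly true of a matching design, and it is not the test the investigators themselves used. The function name and the comparison N of 100 are chosen here for illustration.

```python
import math


def z_against_chance(observed_rate, chance_rate, n):
    """Normal-approximation z for an observed hit rate against a chance rate."""
    standard_error = math.sqrt(chance_rate * (1 - chance_rate) / n)
    return (observed_rate - chance_rate) / standard_error


# Figures reported in the text: 24.9 per cent correct, chance 20 per cent, N = 1530.
z_large_n = z_against_chance(0.249, 0.20, 1530)

# The same 4.9-point margin with a hypothetical N of 100 matchings.
z_small_n = z_against_chance(0.249, 0.20, 100)

print(f"N = 1530: z = {z_large_n:.2f}")
print(f"N = 100:  z = {z_small_n:.2f}")
```

Under these assumptions the identical margin over chance is far beyond conventional significance levels at the larger N and well short of them at the smaller one, so the significance reflects the number of responses rather than the accuracy of the judgments.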

The three studies cited by Allport in his footnote do not con-
stitute evidence for the superior efficiency of nonactuarial pre-
diction. I have managed to find twenty studies which do involve
an empirical comparison of the two techniques, and which can
perhaps shed a little light upon the problem. The ideal design is
one in which the same basic set of facts is subjected on the one 
hand to the skilled analysis of a trained clinician, and on the 
other hand is subjected to mechanical operations (table entry, 
multiplication by weights, or the like). The predictions arrived 
at by these two methods are then compared with respect to their 
success. In the following investigations this design is approxi-
mated to varying degrees. I do not claim that the following is a
complete review of literature, but it represents everything I could
locate by entering the Psychological Abstracts via an extended
and diverse list of topic names, plus inquiry among psychologists
I knew to be interested in the problem.

The first systematic investigation aimed deliberately at getting 
an empirical answer to our question was carried out by Sarbin 
(86). His study has not received anywhere near the attention it
deserves. It was designed from the start to compare the two 
methods, whereas in most of the other relevant studies that com- 
parison was incidental to some other major research aim. I have 
been repeatedly amazed to hear clinical workers make flat state- 
ments about the answer to Sarbin’s question, only to find that 
they had never heard of the study. 

Sarbin chose as his criterion variable academic success as
measured by honor-point ratio. The sample consisted of 162 fresh-
men (73 men and 89 women) who matriculated in the fall of
1939 in the arts college at the University of Minnesota. Honor-
point ratios were calculated at the end of the first quarter of the 
students’ freshman year. The statistical prediction was made by 
a clerk who simply inserted the values of the predictor variables 
into a two-variable regression equation. The predictor variables 
were high school percentile rank and score on the college aptitude 
test. (Note: One psychometric and one nonpsychometric varia-
ble.) The sample used was cross-validating, since the regression
equation had been based upon a previous sample.

The clinical predictions were made on an eight-point scale by
one of five clinical counselors in the university’s Student Counsel-
ing Bureau. Four of the five counselors possessed the doctorate
and all had “considerable experience” in clinical counseling of
university students. The data available to the counselors were
considerably in excess of those utilized by the statistician, name- 
ly: a preliminary interviewer's notes, scores on the Strong Voca- 
tional Interest Blank, scores on a four-variable structured per- 
sonality inventory, an eight-page individual record form filled 
out by the student, scores on several additional aptitude and 
achievement tests, as well as the two scores utilized by the statis- 
tician. In addition, the predicting clinician had one interview 
with the student prior to the beginning of fall quarter classes. 
At the end of the fall quarter the correlations shown in the tabu- 
lation were obtained between the two sets of predictions and the 
facts. There is no significant difference between the efficiency of 
the two methods. 


                    Men    Women
Clinical .......    .35     .69
Statistical ....    .45     .70


Even though the clinician, utilizing all this additional informa- 
tion, is no better at forecasting than the statistical clerk, Sarbin 
felt that perhaps they were hitting different cases and matters
would improve if the clinician were included as a statistical
variable. The increment to the multiple R given by adding his
judgmental rating as a third variable in the predictive system was
only .01 for men and .05 for women, neither of these improve-
ments being significant. Some of the clinicians felt that there
was no practical value in the refined eight-point prediction, and
that they would do better merely being asked to predict success
versus failure. Dichotomizing the continuum at an honor-point
ratio of 1.00 (C average) Sarbin reanalyzed the data in these
terms. For male students, the statistical and clinical methods
were not significantly different in predicting this dichotomy, al-
though there was a slight trend favoring the statistical; for
women, there was borderline significance (.01 < P < .05) in favor
of the statistical. When the data for both sexes were pooled, the
statistical method was superior to the clinical at between the 1
and 2 per cent levels of confidence. The over-all magnitude of this
superiority was, however, only about 6 per cent hits (my calcula-
tion). It was also shown that the clinicians systematically over-
predicted grade average (“leniency error”).

A rather surprising finding, considering the mass of additional 
information they had available, was that the clinicians’ predic- 
tions were significantly more correlated with the two predictor 
variables than the criterion variable was. That is, the clinicians 
overestimated the contribution of the two major predictor varia- 
bles, attributing more criterion variance to them than they in 
fact control. As Sarbin says, “. . . the case-study method takes
behavior segments with known weights and applies other weights
which are less efficient” (86, p. 596). Both methods systematical-
ly underestimate the criterion variance, although it is debatable
whether this should be called “error,” since if the statistical 
method did not do this the mean squared error of estimate would 
necessarily increase. That being so, the corresponding restriction 
of range by the clinician (which Sarbin calls “playing safe”) is 
based upon a sound statistical principle. Sarbin does not present 
data indicating whether individual clinicians did better than the 
regression equation. Reliabilities for the five clinicians based on 
rerating after six months ranged from .64 to .88. It seems quite 
possible that since the clinical and statistical methods were so
nearly equal when the judgments of all clinicians were pooled,
one or two could well have been superior to the regression method.
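
The statistical principle behind this restriction of range can be checked numerically. The sketch below uses synthetic data (a single predictor and an arbitrary slope of 0.5, both assumptions made here and unrelated to Sarbin's sample) to show two things: least-squares predictions have a variance equal to r squared times the criterion variance, and "stretching" them to match the criterion variance raises the mean squared error.

```python
import math
import random
import statistics as st

random.seed(0)  # synthetic data only; not Sarbin's figures
n = 2000
x = [random.gauss(0, 1) for _ in range(n)]        # predictor
y = [0.5 * xi + random.gauss(0, 1) for xi in x]   # criterion

# Ordinary least-squares fit of y on x.
mx, my = st.fmean(x), st.fmean(y)
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
slope = sxy / sxx
intercept = my - slope * mx
r = sxy / math.sqrt(sxx * syy)

predicted = [intercept + slope * a for a in x]

# Variance of the predictions relative to variance of the criterion.
variance_ratio = st.pvariance(predicted) / st.pvariance(y)


def mean_squared_error(preds):
    """Mean squared difference between predictions and the criterion."""
    return st.fmean([(p - b) ** 2 for p, b in zip(preds, y)])


# Rescale predictions about the mean so their variance equals that of y.
stretched = [my + (p - my) / abs(r) for p in predicted]

print(round(variance_ratio, 4), round(r ** 2, 4))
print(mean_squared_error(predicted) < mean_squared_error(stretched))
```

On this reading, a forecaster who keeps predictions closer to the mean than the actual outcomes are spread is not thereby in error; the narrowing is what minimizes squared error when the predictors are imperfect.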

Wittman (103) developed a prognosis scale for use with schizo-
phrenic patients, consisting of thirty variables rated on the basis
of social history (and the psychiatric examination?). With the
exception of marital status, all of the variables were more or less
“judgmental” in character, and would involve varying require-
ments upon the clinical skill of the rater. They range from semi-
objective matters, such as duration of psychosis, to highly inter-
pretive judgments, such as anal erotic versus oral erotic. None of
the predictive variables were psychometric. Numerical weights
were assigned to the values of these ratings on the basis of the
“frequency . . . and relative importance ascribed to them in
more than 50 studies by various authors” (103, p. 21). We may,
therefore, presume the weights employed were not optimal and
hence not fair tests of the power of actuarial prognosis, since they 
were not determined by an actual statistical analysis of any de- 
fined sample but arose from a crude quantification of impressions 
found in the literature. All ratings were made by the investigator, 
who rated either prior to the beginning of a patient's therapy or
by reading the charts (minus progress notes) of old cases. The
agreement of total score between her ratings and those made by
another staff member on a small sample (N = 61) was +.87, but
she points out that the marked bimodality found influenced this 
coefficient markedly. 

Independently of the scale, the psychiatric staff had made a 
three-step rating as to prognosis prior to beginning therapy at a 
“diagnostic conference.” It is not clear from the report whether 
the final statistics cited refer to a pooled judgment or not, but it 
is clear that individual staff members’ predictions must have been 
obtained for study (see below).

The criterion was a five-step rating made at a therapy staff
meeting after conclusion of shock treatment. The degree of con-
tamination of this rating by the psychiatrist's pre-treatment im-
pression is not inferable from the report but would presumably
inflate the percentage of hits for the psychiatric staff. This five-
step scale was collapsed to a three-step for comparability with
the staff clinical judgments, by grouping the two most favorable
outcome categories as “greatly improved” and the next two as
“guarded.” With a “hit” defined as proper placement in this three-
step scale, the results were distinctly favorable to the actuarial
method, as shown in the tabulation. The total hits by the scale
were 81 per cent, as contrasted to only 44 per cent by the
psychiatric staff. This difference is significant at the P < .001
level (my calculations). Wittman states that the individual staff
members ranged from 8 per cent to 81 per cent hits, although no
further details are presented. It thus appears that both the typi-
cal and the pooled staff judgment were far below the scale in
predictive efficiency, and that the best staff member just equaled
the scale.

Five-Step Criterion             Percentage of     Percentage of Hits
Categories                 N    Hits by Scale     by Psychiatrists
Remission ...........     56         90                  52
Much improved .......     66         86                  41
Improved ............     51         76                  36
Slightly improved ...     31         46                  34
Unimproved ..........    139         85                  49

In a second study (104) Wittman and Steinberg present fur- 
ther data based upon a larger sample of 960 patients, this time 
including 156 manic-depressives. The bimodality of prognosis 
scale scores for the schizophrenic group is again in evidence. The 
continuous prognosis scale was divided into three intervals as 
before so as to make its predictions comparable to the three-step 
staff ratings. Both staff judgments and prognosis scale rating
were completed from eight months to three years prior to the
criterion evaluation. In this follow-up study, the superiority of
the scale method is still very evident but of somewhat lesser de-
gree. Total staff hits came to 41 per cent, as contrasted to 68 per
cent for the scale (P < .001). Both scale and staff yield highly
significant chi-squares against the criterion in a nine-fold con- 
tingency table. The contingency coefficient for the scale is .61, 
while that for the psychiatrist is only .21 (my calculations). 
Wittman states that Sarbin revised and shortened the scale and 
that his report on the revision is contained in the same volume
of the Elgin Papers. This is not true of the volume accessible to
me, and I have been unable to locate any such research report.

Schiedt (90) in a doctoral dissertation in jurisprudence showed
that fifteen of Burgess’ factors (e.g., age, marital status, so-
briety), when combined by [...]
about as successful in pre[...]
Bavarian ex-[...]
[...] actuarial predictions is based [...]
of Schiedt’s data shows that
if the most doubtful group (i.e., cases with p near .5) is excluded 
from the actuarial predictions, we have left a set of cases about 
as numerous as those for whom the physician made a prediction. 
In this subset the actuarial prediction is somewhat superior. Un- 
fortunately, we are given no information by Schiedt as to the 
training and skill of the clinician, who is not called a psychiatrist 
in Schiedt’s published paper but simply a prison Arzt. Incidental- 
ly, it is interesting to note the great concern shown by Schiedt 
lest his German readers might find the actuarial technique ob-
jectionable when thus applied to a problem in the prediction of
human behavior, and his struggle to satisfy himself of the genu- 
ineness of his results in the absence of any knowledge of signifi- 
cance tests. 

Conrad and Satter (83) in predicting the success of naval 
trainees in an electrician’s mate school (N = 3500) compared 
the predictions of interviewers of an unspecified degree of skill 
making use of test scores, personal history data, and an impres-
sion gained from the interview, with the predictive efficiency of
a regression equation involving two objective tests (electrical
knowledge and arithmetic reasoning). No cross-validation groups
were studied, but the large N makes it unlikely that the coeffi-
cients would show great shrinkage in moving to new groups. The
criterion was grades achieved in an electrical school to which the
men were assigned. The results were slightly favorable to the
actuarial method of prediction, although whether the difference
would shrink to zero on cross-validation groups cannot be de-
cided from the data presented.

Burgess (17) studied the outcome of 1000 cases of parole from 
three Illinois state prisons. Using 21 objective factors such as
nature of the crime, length of sentence, nationality of father,
county of indictment, size of community, type of residence, and
chronological age, and combining them in unweighted fashion
by simply counting the number of factors operating for or against
a successful outcome of the parole, he achieved certain percent-
ages of success in postdiction which can be compared with the
percentages of two different prison psychiatrists. Again we find
that both of these clinicians employed a “doubtful” category, but 
Burgess’ presentation makes it impossible to say how many cases
were so classed. When he predicts success, each of the psychia-
trists is slightly better than the statistical method (85 per cent
and 80 per cent versus 76 per cent hits). When predicting failure,
each psychiatrist is quite clearly inferior to the statistician (30 
per cent and 51 per cent versus 69 per cent). Since these per- 
centages are based upon a reference class of all cases for the 
statistician but upon a smaller reference class which excludes 
some (unknown) fraction of the “doubtfuls” for the two psy- 
chiatrists, it seems quite safe to favor the statistical method. 

Dunham and Meltzer (37), predicting length of hospitalization 
of schizophrenic and manic-depressive patients, employed a
weighted combination of three predictive variables (marital
status, duration of psychosis, and a rating on amount of insight).
In one cross-validating sample (N = 217) there was no signifi-
cant difference between the success frequency of the two methods
even when the clinicians left about one fourth of the cases un-
predicted, and a 10 per cent difference in favor of the actuarial
method if the total number of cases is taken as a base for both
methods. In another cross-validating sample (N = 288) there
was a 10 per cent difference in favor of the clinicians, but here
their success frequency is calculated with a base of fewer than
half the cases whereas the actuarial prediction is for all. If both
percentages are computed on the same base there is about a 30
per cent difference in favor of the actuarial (my calculations).
The data are not presented so as to make possible a separate cal-
culation of the actuarial success-frequency with doubtful cases
left unpredicted.
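
The effect of the reference class on such comparisons is a matter of simple arithmetic, sketched below. All counts are hypothetical and chosen only to mimic the general pattern described (a clinician who predicts for fewer than half the cases against a formula that predicts for all); they are not the Dunham and Meltzer figures, which the source does not give.

```python
def hit_rate(hits, base):
    """Proportion of correct predictions relative to a chosen base of cases."""
    return hits / base


# Hypothetical counts, not taken from any study.
total_cases = 200
clinician_predicted = 90      # the remaining 110 were left "doubtful"
clinician_hits = 63
actuarial_hits = 120          # the formula predicts for all 200 cases

clinician_own_base = hit_rate(clinician_hits, clinician_predicted)
clinician_common_base = hit_rate(clinician_hits, total_cases)
actuarial_common_base = hit_rate(actuarial_hits, total_cases)

print(f"clinician, own base:    {clinician_own_base:.1%}")
print(f"clinician, common base: {clinician_common_base:.1%}")
print(f"actuarial, common base: {actuarial_common_base:.1%}")
```

With these invented numbers the clinician leads by ten points on his own base and trails by nearly thirty on the common one. Which base is appropriate depends on whether leaving a case unpredicted is itself counted as a failure of the method.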


Lepley and Hadley made an investigation which is not general- 
ly available, and since my only familiarity with it is through 
several correspondents and Super’s summary of it, I shall quote
the latter in full:

The Surgeon’s Classification Board provided an opportunity for
a more comprehensive clinical evaluation of cadets being con-
sidered for flying training during several months in which it was
experimented with during World War II (described in a military 
report by W. M. Lepley and H. D. Hadley). The board consisted 
of a flight surgeon and an aviation psychologist, who interviewed 
each cadet with stanines below the required levels for all three 
air crew assignments (at that time 8 for pilot and bombardier, 5
for navigator). The interviews lasted approximately eight minutes
each, ranging in length from five to twenty minutes. A total of
1,524 cadets were interviewed during the six months of the
board’s existence at this one classification center, and 285 were
sent to pilot training because the board’s review of the test scores
and interview data led it to believe that the cadet would make a
good pilot. Follow-up data were obtained for 259 of these cadets,
who were test-matched with 146 cadets sent to training at a
somewhat earlier date when standards were lower and without
having been passed on by a board. Various analyses were made
by class and time of training; in the most legitimate comparison,
68.9 per cent of the cases passed by the board failed in training,
whereas 73 per cent of those with similar stanines who went auto-
matically to training failed. The critical ratio was 0.50, showing
that cadets who were clinically evaluated by a board of experts
were no more likely to succeed than others who had the same
stanine or psychometric index but were not clinically evaluated.
Despite certain defects in the design of this real-life experiment
(e.g., elimination rates were not quite the same when the two
groups were in training, being slightly lower for and, therefore,
favoring the board cases), Lepley and Hadley seem to have defi-
nitely put the burden of proof upon those who claim that the
clinical method is superior to a comprehensive battery of ob-
jectively validated and summated tests. (96, pp. 545-46.)

The data most frequently mentioned by the actuarially minded 
are those gathered by the Army Air Force psychologists and re-
ported in one of the AAF Research Reports (47). I am afraid
that this research, while of the greatest intrinsic importance
and interest, is largely irrelevant to the issue and for the same
reason that Polansky’s work is irrelevant when cited on the other
side. The specific subproject commonly quoted is the one entitled
“Clinical Type Procedures,” treated in Chapter 24 of the report.
But most of the negative findings deal with the low validity of
particular instruments, e.g., Rorschach, and this low validity
shows up when the component variables are treated statistically
for predictive purposes just as when the treatment is “global” or 
“clinical” (cf. p. 652, Table 24.5, and other tables in the same 
section). It is true that, as Dr. Super writes me, “in one sense,
then, the AAF did ascertain the relative validity of clinical judg- 
ment: when clinicians used their favored techniques in their 
favored way, they did not do as good a job as the statisticians did 
when they used their favored techniques (objective tests) in their 
favored way (validated and weighted).” However, the compari-
son did involve a mixing of the question of data and that of
method of combining, and hence is not strictly relevant. The sub- 
project CE 707A, “Conference for the Interpretation of Test 
Scores and Occupational Background” (pp. 652-56 of the re- 
port) is misleadingly titled, since study of the original mimeo- 
graphed report (81) and personal communications from some of 
the research team make it clear that the interviewers did not 
have the test scores available, so that the predictions were being 
made from other data than those which were utilized actuarially. 

On the other hand, this report does contain data relevant to 
our main problem, although these are not the data usually re- 
ferred to. Neither Lepley, predicting clinically from the scores 
on his Personal Audit, nor Humm doing the same from the 
Humm-Wadsworth profile, were able to improve upon the validi- 
ty of a straight actuarial (regression) technique. Humm actually 
did worse, since his ratings had no validity while two of his test 
scores showed significant (although very low) correlations with 
the criterion (pp. 583-88 of the published report). It was also
shown that the clinical use of the Rorschach had the same lack
of validity for this criterion as did a regression combination of
Rorschach variables. A similar result obtained for a small num-
ber of traits scored on the TAT. The probative significance of all
these failures on the part of the clinical method is greatly reduced
by the low correlations shown between the basic variables them-
selves and the particular criterion involved in these investigations.

The reports of Kelly and Fiske (62, 63, 64) on the clinical psy-
chology trainee assessment study are sometimes cited as showing
the superiority or at least equality of statistical to clinical proce-
dures. In the summary chapter of their most complete publica-
tion, these authors make the following remark: “At this point,
readers are reminded of the overall findings of the project with
respect to the relative accuracy of statistical and clinical predic-
tions of future behavior; in this situation both approaches worked
equally well [italics theirs].” (64, p. 199.)

I have made no effort to review this monumental investigation, 
careful study of which is obligatory on all clinicians. Such study 
makes it clear that in the quotation above the authors are not 
using the terms “statistical” and “clinical” in precisely the way 
I have proposed, so that their summary statement must be taken
to refer to some mixture of the two dichotomies, kind of informa-
tion and mode of combination. For this reason, I hold that those
who cite the Kelly-Fiske study on the actuarial side are making
the same mistake Allport makes in citing the Polansky study on 
the other side. Some of the most arresting findings of Kelly and 
Fiske, such as the insignificant contribution of either a “prelimi- 
nary” or “intensive” (two-hour) interview to the validity coeffi- 
cient, are really tangential to the problem as we have posed it. 
Such data are valuable in their sobering effect on clinical en- 
thusiasm, and thus indirectly affect one’s orientation to the whole 
controversy. But the predictions and ratings before the interview
were already truly clinical in our sense, i.e., judgmental, noncleri-
cal inferences from the documentary information. The same is
true of certain other fascinating findings in the Kelly-Fiske in-
vestigation, such as the failure of pooled, post-conference judg-
ments by skilled clinicians to improve over the original predic-
tions based on the same data. On the other hand, the fact that
these pooled judgments, based upon an integrative conference,
were no more valid than an arithmetical combination of pre-
conference ratings does bear at least obliquely upon our central
question (64, p. 177). Taking the original ratings as the raw data,
we may ask what is the best way to use them? The evidence
suggests that a clerk can pool them as well as a skilled staff. But
even this comparison is not quite in accord with our paradigm,
since the staff conference presumably involves a re-examination
of the individual judges’ ratings in the context of a group discus-
sion of the data the judges had used separately.

There are, however, some findings in the Kelly-Fiske report 
which can be made to bear directly upon our problem. Because 
of the lack of a suitable cross-validating group, the authors did 
not report multiple correlations (64, p. 157). The correlations of 
their criteria with individual tests (e.g., Miller Analogies, psy- 
chology key of the Strong, certain of Gough’s Multiphasic keys) 
rather definitely suggest a multiple R at least as good as the
judgment made by assessing clinicians (pre-interview!) from
these same test and documentary data. Thus, the median validity
coefficient for the prediction of a clinician based upon objective
test scores plus a credential file (blueprint, letters of recommen-
dation, Civil Service Form 57) was only .29, ranging from .04 to
.51 for the various criteria (64, p. 168). We may surely assume
that these would shrink if the nonpsychometric data in the cre-
dentials file had been excluded from the information presented to
the clinician for judgment. Inspection of the tables on pages
158-59 of the report indicates it very likely that a suitable com-
bination of the best test variables in a regression system could
hardly fail to do better than this. Consequently, this limited por-
tion of the Kelly-Fiske study can presumably be included, with
suitable reservations, as evidence leaning toward the actuarial
side.


A study by Dunlap and Wantman (38) has sometimes been
cited in discussions as adverse to clinical methods of prediction
(cf. 35, p. 57). In my view this study is only indirectly relevant,
since it again confounds the question of kind of information with
that of mode of combination. Using several criteria of pilot suc-
cess (pass-fail, ground school grades, time in learning, flight in-
structor ratings and check lists, and camera records of instrument
readings during flight), Dunlap and Wantman made a compari-
son between the predictive efficiency of an objective, paper-pencil
test battery and judgments on several sorts of variables made
by interviewers, including an over-all rating on “fitness for flight
training.” The interviewers worked as a three-man team of psy-
chologist, personnel man, and aviator. A semistandardized inter- 
View was used and a check list and rating form was available to 
aid the interviewers in making and recording their judgments. 
The interviews lasted for 25 minutes. The actual interviews were 
Preceded by a training period which included a critical discussion 
of two recorded practice interviews with each board. It was 
shown that the reliabilities of mean ratings were fairly good 
(Spearman-Brown estimates .81 to .87 for the nine rated varia- 
bles). The study was done using interview boards at four dif- 
ferent universities and considerable variation among teams ap- 
Pearcd. The total sample consisted of 208 pilot trainees. Most of 
the validity coefficients (10 rated variables and 9 criteria) are 
either insignificant or too low to be practically useful. 
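
The Spearman-Brown estimates quoted above step up the reliability
of a single rater to that of the mean of several raters. A minimal
Python sketch of that formula follows, assuming (as the three-man
board suggests, though the report's exact procedure is not given
here) that k = 3 ratings were pooled. The function names are
illustrative only, and the single-rater figures it prints are
implied by the formula, not reported by Dunlap and Wantman.

```python
def spearman_brown(r_single: float, k: int) -> float:
    """Reliability of the mean of k parallel ratings, given single-rater reliability."""
    return k * r_single / (1 + (k - 1) * r_single)


def implied_single_rater(r_pooled: float, k: int) -> float:
    """Invert the formula: single-rater reliability implied by a pooled estimate."""
    return r_pooled / (k - (k - 1) * r_pooled)


if __name__ == "__main__":
    k = 3  # assumption: the three board members' ratings were averaged
    for r_pooled in (0.81, 0.87):  # range of pooled estimates quoted in the text
        r1 = implied_single_rater(r_pooled, k)
        print(f"pooled {r_pooled:.2f} -> implied single-rater {r1:.2f}")
```

On this reading a pooled reliability in the .80's is compatible
with considerably lower agreement between any two individual
board members.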

Some of the criteria were probably too unreliable to be
predictable, although others (e.g., Ohio State Flight Inventory
and Pennsylvania Camera Criterion) were very satisfactory, and
even the less trustworthy were shown to be predictable to some
extent, both by certain of the interview ratings and by objective
tests. The report indicates that the interview boards did not have
access to the two most powerful psychometric predictors, which
were scores on the Biographical Inventory and the Mechanical
Comprehension tests. That is, the clinical predictions were not
based on the same information as the statistical. The third
paper-pencil test, a Personal History Inventory covering several
areas of relatively objective facts of the subject’s life history,
was available to each interviewer and he was supposed to study the
responses prior to the interview as a partial basis for guiding
the questioning. Each interviewer was to make his own “subjective
scoring stencil” as an aid in interpreting the written responses
to this questionnaire. Strictly speaking, the most meaningful
comparison for our present purposes is between the interviewers’
over-all prediction (based on the subjects’ responses to the
personal history questionnaire as followed up in the interview
questioning) and the prediction yielded by a strictly mechanical
scoring of the questionnaire to yield a single score. This latter
scoring was an empirical scoring based on item analysis of the
records of 1427 previous trainees against success in primary
flight training. Examination of the tables with this comparison in
mind is difficult because the criteria available varied among the
four schools and the sample sizes vary even within the tables for
a single school. Rough estimates (mine) from the maximum N’s
indicated suggest that there is no significant difference between
the validity of the P-H score and the interviewers’ estimates,
although among significant correlations the preponderance favors
the interviewer. In one sample there is a difference of .40 in
favor of the latter (from .15 to .55, Table XIV on p. 24 of the
report) which would be at about the 5 per cent level.

The authors do not concern themselves with this comparison,
but stress the fact that the inclusion of the interviewers’ ratings
in the multiple regression system does not materially improve the
multiple correlation over what is yielded by the objective tests.
But as pointed out above, of the three tests considered only one
was available to the clinicians when making their judgments. It
is presumably worth noting that the interview was not justified
as a procedure when its time and cost are considered in relation
to the negligible increment it gives to predictive efficiency, even
though that is not the most relevant comparison to make for our
purposes. None of the multiple coefficients were cross-validated.


To the extent that the study is germane to our topic, it seems to
indicate approximate equality between clinical and actuarial
methods of combining the same data for predictive purposes.

Bobbitt and Newman (14) studied the predictive efficiency of
an unweighted battery of aptitude, achievement, and personality
tests against a criterion of success or failure in the training
program for cadets in the United States Coast Guard Academy. The
criterion was uncontaminated by the test data. Two short (ten-
to twenty-minute) interviews were also held by two independent
interviewers (a psychologist and a psychiatrist). The interviewer
had all the test data at hand and attempted to combine the
scores with his interview impressions [...] numerical rating. The
interview ratings [...] the standard scores were summed to yield
an interview score.
Although the data were not analyzed with the present comparison
in mind, detailed inspectional study of the cumulative percentages
suggests that the interviewer’s final clinical judgment tends to
run from 2 to 7 per cent superior to the test battery at most
levels. Significance tests are not given, and it is impossible to
know whether this slight advantage would be lost had the test
battery been weighted optimally. Rough estimates of the standard
errors suggest that whether the improvement is significant would
depend on the success-failure split, since with a split yielding
minimum standard error the percentage difference would apparently
be of borderline statistical stability. These authors also studied
the efficiency of a score obtained by an (unweighted) addition of
the interview and test results, a procedure which added another 2
to 3 per cent to the hits. It is worth mentioning that this study,
which seems to give a slight edge to the clinician, utilized
apparently very skilled interviewers whose judgments were of
extremely high reliability (78). Davis (35, p. 57) cites the study
as showing “no improvement” yielded by the clinical procedure,
presumably because of the small size of the increment. The authors
emphasize the homogeneity of the group in respect to some of the
tested capacity variables. In subsequent work by these researchers
it appears that the interview has been eliminated from the
selection procedures of the academy, so apparently its
contribution was not considered to be sufficient to justify the
additional effort (77, p. 249).

Borden (15) studied the prediction of parole violation in 261
ex-reformatory inmates. He began with 28 factors, about 22 of
which were relatively objective indicators obtainable from
history material such as legal documents. Pearson correlations were
computed (uncorrected for the extreme coarseness of grouping)
against a five-step objective criterion of parole success based on
status one year after release on parole. All the relationships were
very low, the highest being only .20 (number of previous
commitments). “Psychologist’s prognosis,” the clinical prediction,
correlated .16 with the criterion, as did the diagnosis of
intellectual level (four steps). The multiple R on all three of
these predictors was .41, not cross-validated. The tabular data
and the partial betas indicate that the optimal combination of
intellectual level and previous commitments would be more
efficient than the clinical prediction by the psychologist. I have
examined Borden’s raw data in another way, reducing the criterion
to a (more meaningful) dichotomy and locating an optimal cut for
each of his three most powerful predictors. The number of previous
commitments gives 62 per cent hits on the sample, as compared with
58 per cent for intellect and 58 per cent for the psychologist’s
prognosis. It should be noted that the psychologist did not
predict for 7 per cent of the cases, a fact not pointed out by
Borden. Since the other two variables correlated only -.10, it
seems quite safe to conclude that in combination they would be
superior to the clinical estimate. All these comparisons lose much
of their meaning when it is seen that a blind guess of success on
parole will succeed 58 per cent of the time, this being the base
frequency for the entire sample. All things considered, this study
can presumably be tallied on the actuarial side.
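
The point about the 58 per cent base frequency can be made
concrete by restating each hit rate as a gain over blind guessing.
The short Python sketch below uses only the percentages quoted
above; the dictionary and function names are illustrative and are
not Borden's.

```python
BASE_RATE = 58  # per cent succeeding on parole; "always predict success" hits this often

# Hit percentages at the optimal cut, as quoted in the text.
hit_rates = {
    "previous commitments": 62,
    "intellectual level": 58,
    "psychologist's prognosis": 58,
}


def gain_over_base_rate(hit_rate: int, base_rate: int = BASE_RATE) -> int:
    """Percentage points by which a predictor beats the blind guess."""
    return hit_rate - base_rate


if __name__ == "__main__":
    for name, rate in hit_rates.items():
        print(f"{name}: {rate}% hits, {gain_over_base_rate(rate):+d} points over base rate")
```

Read this way, only the count of previous commitments improves on
the blind guess at all, and then by four points.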

Hamlin (48) studied 501 consecutively admitted reformatory
inmates, using a composite criterion of adjustment within the
institution, which included such items as number of times in
guardhouse, shop demerits, shop and school grades, and discipline
marks. The prediction problem was to estimate this composite
criterion over a four- to ten-month period following admission.
[...] chiefly objective or semi-objective [...] part of the
history on all cases, were [...] zero-order correlations with the
[...] was chosen to yield an adjustment [...] score included
several clinical estimates [...] it is not particularly relevant
[...] the author also presents correlations [...] with the
criterion, these [...] from .25 to .35. Significance tests [...]
of the coefficients are not given. [...] to us here, “Prognosis
for institutional adjustment, psychiatrist’s estimate,” ranks
thirteenth in efficiency among the twenty (C = .28). A similar
clinical item, although not aimed directly at the criterion
studied, “Prognosis for future behavior, psychiatrist’s estimate,”
ranks second. In fairness to clinical judgment, it should be added
that the most powerful predictor was “Original assignment in
reformatory,” which is presumably based on some kind of human
judgment by an administrator, but the author does not explain it.
The criterion correlation of a prediction score based on a linear
combination of fifteen nonoverlapping items (not optimally
weighted) was .55. This excludes the psychiatric prediction of
institutional adjustment, but it includes the psychiatrist’s
prediction of a different criterion (future behavior)! Although
the author does not compute a multiple R based on the eleven or
twelve purely objective factors alone, inspection of the table,
together with the above-cited .55 figure for all fifteen, surely
justifies us in saying that the actuarial prediction would be at
least as efficient as any of the clinical or administrative
estimates, and very probably more efficient.

Bloom and Brundage (13, p. 251) report on the validity of
interviewers’ quality classification ratings against a criterion of
success in training. The study involved a sample of 37,862 naval
enlisted men who were subsequently sent to naval training schools
for one of nine types of specialized training. The interviewers had
test scores before them during the interview, but the correlation
between interviewer evaluations and success in training was
actually lower than that yielded by the same test scores used alone.

Melton (78) studied the efficiency of fourteen counselors in
forecasting the honor-point ratios earned by 543 entering arts
college freshmen in their first year’s work. The actuarial
prediction was based upon a two-variable regression equation (ACE
and high school rank) with betas derived from a previous sample.
The counselors made their predictions immediately after an
interview of 45 minutes to one hour duration, and had available
the two regression variables plus scores on the Cooperative
English Test, the Mooney Problem Check List, and a four-page
personal inventory form. The counselors were graduate students in
psychology or educational psychology in their second to final year
of graduate study. He found that the mean absolute error of the
actuarial prediction was significantly less than that of the
counselors; the counselors overestimated honor-point ratio; there
were significant differences among the counselors in their average
error; eleven counselors were less accurate than the regression
equation, while three were more accurate, but not significantly;
when a counselor predicts knowing the actuarial prediction, his
result tends to be less accurate than the actuarial prediction
itself, i.e., the addition of clinical judgment reduces predictive
power (borderline significance); and, finally, if counselors who
are poor predictors are allowed to use the actuarial table in
making predictions, they then predict as well as the good
predictors.

Barron (8), in a carefully executed study of test correlates of
therapeutic outcome, was able to compare the efficiency of
clinical and mechanical sortings of MMPI profiles. Thirty-three adult
psychoneurotics received intensive outpatient psychotherapy
(one hour weekly for six months) and were judged as to
improvement by two independent experts other than the therapists.
These criterion judgments had a reliability of .91 and on a
two-category sorting yielded disagreement on only 2 of the 33
patients. Judgments were uncontaminated by any knowledge of MMPI
profiles. Eight clinicians skilled in MMPI interpretation were
asked to predict this outcome criterion knowing only the patients’
age, sex, and MMPI curve. Total pooled hits (N = 264 = 8 × 33)
[...] the three best clinical sorters averaged [...] a priori
mechanical systems to [...] yielded hit frequencies of 78 per cent
[...] (p. 239). Thus, the over-all efficiency for the skilled
clinicians is between 11 and 18 per cent [...] of the same
psychometric data, [...] superior to the over-all clinical [...].
Even allowing for the [...], we find the top three still slightly
behind the weakest mechanical rule, although not significantly so.
Similar results were obtained with the Rorschach, but the
significance of this comparison is reduced by the fact that
neither a mechanical use of the usually mentioned signs nor a
subjective sorting by four Rorschach experts (with access to the
protocols) showed any correlation with therapeutic outcome.
Barron’s findings should probably be classed as slightly in favor
of the statistical method, but since the differences are of
borderline significance and there is variation over eight
clinicians and three mechanical systems, I shall lean over
backwards to call it a draw.

Blenkner (12) studied predictive factors in family casework,
and she reports some incidental data which are highly relevant
to our problem. Two skilled judges evaluated the movement of
casework clients by reading the entire case records from initial
contact to closing. These movement judgments had a reliability
of .86. Three other judges (p. 78) “who had had considerable
experience in casework, supervision, and/or teaching and who
were not members of the agency staff” (p. 67), after studying the
initial interview data only, filled out a ten-page schedule which
had been pre-tested on similar material and in the use of which
they had been trained. Five factors from this initial-contact
schedule were found to be significantly associated with movement
in a criterion sample of 63 cases, each of whom had experienced
at least two interviews (median approximately five). These five
factors were (1) referral source, (2) problem area, (3) insight,
(4) resistance, and (5) degree to which client was overwhelmed.
It is evident that the rating on these five variables as exhibited
in the records of the initial contact already involves some
considerable degree of clinical judgment by the skilled case
reader. Treating each factor as either favorable or adverse, and
arbitrarily assigning a score of 2 to each favorable factor, a
prediction score defined as the product of the five predictor
values was set up ([...], ranging from 0 to 32 by powers of 2).
Such a formula can hardly be considered a best fitting statistical
function, for obvious reasons; for example, a client with four
favorable factors and one adverse gets the same prognosis score as
a client with all five adverse! So the study is not a fair test of
the statistical method. However, this very crude prediction index
showed a point-biserial of .62 with (dichotomized) movement
ratings in the derivation sample, which shrank to .52 on the
cross-validation sample (N = 47). The same skilled judges were
also asked to make a dichotomous clinical prediction of movement
on the basis of their reading of the same initial interview data;
these predictions, as made by each of the three judges, had no
validity, and the judges did not agree with one another.
Apparently these skilled case readers can rate relatively more
specific but still fairly complex factors reliably enough so that
an inefficient mathematical formula combining them can predict the
criterion; whereas the same judges cannot combine the same data
“impressionistically” to yield results above chance.
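
Why a product of factor scores makes a poor prediction function
can be seen in a few lines of Python. The sketch below assumes, as
the four-favorable example implies but the passage does not state
outright, that an adverse factor is scored 0. The function names
and the additive comparison score are illustrative assumptions and
are not part of Blenkner's procedure.

```python
from math import prod

FAVORABLE = 2  # score given in the text to each favorable factor
ADVERSE = 0    # assumption: implied by the text's example, not stated explicitly


def product_score(factors: list[bool]) -> int:
    """Product-type index: multiply the scores of the five rated factors."""
    return prod(FAVORABLE if favorable else ADVERSE for favorable in factors)


def additive_score(factors: list[bool]) -> int:
    """Hypothetical additive alternative: count the favorable factors."""
    return sum(factors)


if __name__ == "__main__":
    cases = {
        "all five favorable": [True] * 5,
        "four favorable, one adverse": [True] * 4 + [False],
        "all five adverse": [False] * 5,
    }
    for label, factors in cases.items():
        print(f"{label}: product={product_score(factors)}, additive={additive_score(factors)}")
```

Under these assumptions the product index cannot distinguish a
client with one adverse factor from a client with five, whereas
even a simple count can.
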

Hovey and Stauffacher (59) compared what they characterize
as “intuitive” and “objective” prediction from a test. As in
Barron’s study, the data available to the clinical judge were
identical with those used mechanically, namely, the MMPI profile
alone. On the basis of previous empirical research on a student
nurse population (N = 97 plus cross-validation on N = 40) a
collection of 35 personality traits (as judged by nursing
supervisors) was known to be significantly associated with the
various MMPI scales (considered singly). The task set was to
predict supervisor ratings on a third sample of 47 student nurses
by utilizing the MMPI profile in two ways. In the “objective”
(mechanical, actuarial) approach, a trait from the list would be
attributed to a subject if the subject showed a deviation (amount
not [...]) on an MMPI scale which had been associated with that
trait in the original study. From four to six MMPI scores,
deviating high or low, were utilized in each case, depending upon
the magnitude of deviations. Special rules were set up, more or
less arbitrarily, to deal with instances of scales “in
opposition”: e.g., if three high scores argued for the same trait
but one [...] argued against it, the trait would be automatically
[...]; whenever the peak score exceeded all others by five [...]
points, the traits correlated with it were attributed; and so on.
In this manner, “present” or “absent” predictions were made on
from 9 to 24 of the 35 traits for each of the subjects. Using the
other, “intuitive” approach, an experienced clinician examined
the profile and decided for each of the mechanically made trait
predictions whether it would be rated “present” or “absent” by
the criterion supervisor. That is, the skilled judge knew which
traits (among the available pool of 35) had been mechanically
predicted for each profile, although he did not know in which
direction. He was required to force a judgment on these subsets
only, as predetermined by the mechanical method for each case.
A total of 663 single-trait predictions were made by each method.
An uncontaminated rating by three supervisors, each having
observed each student nurse for a month, was the criterion,
attributing a trait (or its absence) if two of the three
supervisors checked it and the third did not check the opposite as
being true. Only 328 of the 663 test predictions “could be
compared” with plus and minus evaluations by the supervisors. The
mechanical method yielded a hit-miss ratio of 1.7:1, as compared
with 2.8:1 for the clinical method (P < .01).

Since this is the sole study found in which the mechanical
method was significantly inferior, it deserves careful study by
way of interpretation. The most obvious caveat arises in
connection with the rules used in making the mechanical
predictions. Without invoking some ideal mathematical function
which would, by definition, yield the optimal trait attributions
for every MMPI pattern, one may fairly ask what guarantee there is
that the mechanical method used was a fairly reasonable
approximation to the best fit of even a rather simple type of
prediction equation? There seems to be no reason for assuming
this. Furthermore, all sophisticated MMPI workers operate with
profile patterns, as the clinicians in this study quite
consciously did (personal communication; see also Hovey’s note 3
on p. 144 of the original study, 58). One has no way of knowing to
what extent even such crude patterning methods as the Hathaway
code (51) are approximated by the mechanical rules used. So what
we have is actually a comparison of the predictive efficiency of
skilled MMPI readers with that of a linear, nonconfigural function
of non-optimal weights. I would argue that an empirical
determination of weights is a legitimate part of the very
definition of the actuarial method. In the case of multivariable
devices, such as the MMPI, where profile form is known (or even
thought) to be highly relevant, I think it fair to go farther and
to say that if only a linear function is tried, and no test is
made for significant interaction effects, the statistical method
has still not been given its innings, even though the optimal
linear weights had been fitted.

On the other hand, one must beware of any temptation to
settle the issue verbalistically in favor of the statisticians. It is
tautologous that the “best rule” will excel any less-than-best
rule, but this nonspecific truism of decision theory does not help
us in formulating the “best rule.” As Dr. Hovey points out (in a
personal communication), “. . . using peak and valley scores in
individual profiles would be superior to using high and low
scores. But even with 137 cases . . . there would have been too
few cases of various combinations to make it worthwhile.” If a
many-variable prediction system is highly “configurated” (see
next chapter) then determination of the function form, and, a
fortiori, estimation of the constants, require a very large N. For
obvious reasons, shrinkage on cross-validation due to excessive
capitalization on sampling errors tends to be greater as the
prediction function becomes more complex (e.g., involving higher
powers, cross-products, numerous constants). Certain patterns
may not appear at all in a limited statistical experience, or too
infrequently to permit “statistical discovery.”

For example, the MMPI Atlas (52) reports only 1 per cent of
male psychiatric patients as exhibiting a profile with a 68’ code.
This weak characterization of a twelve-variable pattern extends
only to two of the nine clinical scales. Suppose I wish to
investigate whether the trait “acting out” in such cases is
differentially expected depending on whether some other score, say
Hy, is high or low. In order to locate enough cases to have even
10 high and 10 low Hy profiles among those with the required Pa,
Sc [...], the actuary would need a patient pool of approximately
2000! The obvious retort is, “But where does the clinician get his
experience of this particular pattern? Isn’t he subject to the same
sampling problem, plus the errors of human recall and weighting?”
To the extent that the clinician is doing nothing but generalizing
statistical experience, I think this objection is unanswerable.
The answer, if any, has to be that the clinician in some cases
synthesizes his personality description without specific experience
with the particular pattern, utilizing his vague theoretical causal
model as a means of extrapolating to regions of the profile space
not hitherto sampled. His theory, poor as it is, as to the
psychodynamics of the Hy scale may lead him to conclude that Hy
elevations argue against acting-out in paranoid-schizoid
individuals. It is surely possible for clinicians to think this
way in the absence of direct statistical experience with 68’
profiles. How successful such causal-theory-mediated
extrapolations are likely to be is an empirical question. The main
point here is this: “The best mathematical function will excel any
other rule for combining the same data” is a tautology; but it is
not a tautology that “The best mathematical function which can be
appropriately fitted on the basis of a medium-sized statistical
experience will excel the judgment-mediated decisions of a
clinician who utilizes a causal-dynamic theory respecting the same
scores and traits and who has the same limited statistical
experience.” This second proposition may or may not appeal to
one’s prejudices, and we have examined it at length in the
preceding sections. But it is obviously not, like the first one, a
purely mathematical or “logical” truth.
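
The figure of approximately 2000 follows from simple arithmetic
on the 1 per cent frequency. A minimal Python sketch, assuming
(for illustration only) that high and low Hy profiles are about
equally common among 68’ cases, so that 20 such cases suffice; the
function name is hypothetical.

```python
from math import ceil


def required_pool(cases_needed: int, pattern_percent: float) -> int:
    """Patients who must be screened to expect `cases_needed` with the pattern."""
    return ceil(cases_needed * 100 / pattern_percent)


if __name__ == "__main__":
    high_hy, low_hy = 10, 10   # cases wanted in each Hy subgroup (from the text)
    percent_68 = 1             # per cent of male psychiatric patients with the code
    print(required_pool(high_hy + low_hy, percent_68))  # 2000
```

An uneven split between high and low Hy would push the
requirement higher, since the scarcer subgroup sets the pace.
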

A further statistical difficulty in assessing this study concerns
the inverse probability problem. From the fact that the trait
“shy” is significantly correlated with the Pt scale (58, p. 143) it
does not, of course, follow that when Pt is somewhat elevated, or
is the peak score, “shy” should be attributed. Whether or not
this is wise policy depends not only upon the shyness-Pt
correlation but also upon the base rate of “shy” in the population
under study. If 10 per cent of nurses-in-general are described as
shy and a significantly greater fraction, say 40 per cent, of
nurses with Pt > 70 are so described, the best bet for either a
high or low Pt case is still “not shy.” Since the mechanical
procedure attributed traits on the basis of four to six of the
MMPI scales being classed as deviate (high or low), and anywhere
from 9 to 24 of the traits were thereby attributed to each nurse,
it seems very likely that traits were being attributed in
individual cases which should not have been from purely actuarial
considerations. If the clinician was being more sensible in this
respect, he had an advantage, because the actuarial method was not
being applied according to its own recognized rules. Finally, it
is not clear what was the effect of presenting to the clinician
only those traits which the mechanical rule scored for the
particular case. In a way, this is letting the clinician correct
the actuary, after first screening out certain potential clinical
errors. Either the clinician accepts the trait, or he reverses the
mechanical prediction. Presumably the reversals are very often
cases where configural thinking enters the picture; otherwise he
lets it alone. I think these several considerations show that this
study must be interpreted with extreme caution, and that it
indicates at most the superiority of a skilled MMPI reader to an
undoubtedly nonoptimal linear function. Out of the kindness of my
heart, and to prevent the scoreboard from absolute asymmetry, I
shall score this study for the clinician.
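
The base-rate argument above reduces to a one-line decision rule:
attribute a trait only when its probability in the subgroup at
hand exceeds one half. The Python sketch below applies that rule
to the hypothetical 10 and 40 per cent figures from the text. The
function names and the equal-cost assumption (a false "shy" is
treated as no worse than a missed one) are illustrative additions.

```python
def best_bet(p_trait: float, trait: str = "shy") -> str:
    """Most probable label, assuming both kinds of error cost the same."""
    return trait if p_trait > 0.5 else f"not {trait}"


def expected_hit_rate(p_trait: float) -> float:
    """Hit rate achieved by always making the best bet at this probability."""
    return max(p_trait, 1 - p_trait)


if __name__ == "__main__":
    # Hypothetical figures from the text: 10% of nurses in general are "shy",
    # 40% of those with Pt > 70.
    for label, p in (("nurses in general", 0.10), ("Pt > 70", 0.40)):
        print(f"{label}: bet '{best_bet(p)}', hit rate {expected_hit_rate(p):.0%}")
    # Attributing "shy" to every Pt > 70 case would be right only 40% of the time.
```

A mechanical rule that attributes the trait whenever the
correlated scale is elevated ignores this threshold, which is the
sense in which the actuarial method was not applied by its own
rules.
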

A final relevant study I have from personal correspondence
with Henry Chauncey, president of the Educational Testing
Service. In 1936 he undertook a comparison of predictive methods,
the criterion being college grades (end of freshman year) and the
subjects being a random sample of 100 Harvard entering freshmen.
Statistical predictions were made on the basis of high school
rank and College Board Examination scores. These were genuine
predictions, i.e., made at the start of the freshman year [...].
Validity correlations were in the .60’s, the statistical validity
ranking second. The difference between the statistical coefficient
and that of the “best” clinician would not be significant with an
N of 100.

Schneider, Lagrone, Glueck, and Glueck (91) studied the
utility of the Glueck prediction tables in the military situation.
The subjects of the investigation were 200 army general prisoners
who had been delinquent in civilian life prior to their entrance
into the army, and who were confined at a rehabilitation center
at the time of study for having committed offenses while in the
army. The clinical predictions (which were actually postdictions)
were psychiatric diagnoses made at the rehabilitation center by
army psychiatrists. These diagnoses were based upon many
sources of data, such as FBI and police reports, data from service
records, questionnaires filled out by employers, teachers, parents,
relatives, former army associates, hospital reports, Red Cross
social histories, and interviews by the psychiatrists and
psychologists. In other words, the information available to the
diagnosing psychiatrist included all the facts which were employed
in the actuarial predictions, and more. If it is assumed that the
diagnoses would have been the same had the psychiatric examination
preceded the military arrest, and that a diagnosis of psychopathic
personality, psychoneurosis, psychosis, or post-traumatic
syndrome would have been considered at the induction center as
predictive of failure in a military situation, 168 of the 200
cases, or about 84 per cent, would have been predicted to fail by
the psychiatrist’s diagnosis. These assumptions are certainly
false, but they err in a direction which strongly favors the
psychiatrist. That is, it can hardly be supposed that the
percentage of diagnoses after failure would be less than the
percentage at the time of induction. The actuarial prediction was
based upon a mechanical combination of the five factors in
Glueck’s tables: parental education, intelligence, age at first
[...], [...] of beginning work, and industrial skill. The data
needed to enter this prediction table would be easily available at
induction, and in the present case were in fact obtained
independently of the information collected at the rehabilitation
center. In other words, the prediction based upon the Glueck
tables is a prediction in a somewhat more genuine sense than that
based upon psychiatric diagnosis. For these same 200 cases, the
mechanical application of the Glueck tables would have resulted in
identification of 84.5 per cent of the group of 200 soldiers.
Here, then, the two methods are of equal predictive efficiency, if
we ignore the fact that the study favors the clinical method by
the nature of the time relationships involved. No indication of
the false positive rate is given for either method. Since general
experience with statistical screening devices in such situations
suggests that many are screened who are not considered diagnosable
upon closer psychiatric study, it is particularly important to
know the false positive rates for the Glueck tables versus the
psychiatrist. Since we lack this information, the present study
can only be classed as indeterminate, and I have discussed it for
the sake of completeness but have not included it in the summary
tally.

In the interpretation of these studies, there are several com 
Plicating factors which must be kept in mind. In the first place, 
we know too little about the skill and qualifications of the clini- 
cians who were making the predictions. For instance, there is 19 
reason to assume that the guesses of an otherwise undescribe 
Bavarian physician will be based upon sufficient psychiatric 04 


sight so that they ought to be taken as fair samples of the 5 
come of clinical judgment. 


Secondly, some of the studies have involved the comparison of
clinical predictions with the predictions of regression equations in
which the statistical weights were determined by the data of the
group to be predicted, not cross-validated. Partly counterbalancing
this is the fact that only seven of the multivariable studies used
empirical weights assigned by efficient methods; the rest assigned
weights judgmentally or by other non-optimal methods.

Thirdly, only five of these studies evaluate predictive efficiencies
of the several clinicians separately. The clinician is a shadowy figure,
and while it is important to know what the average clinician can
do in competition with the actuary, it is also important, and of 
even greater theoretical interest, to know whether there are some 
clinicians who can (consistently) do better than the regression 
equation. However, it is difficult to evaluate the argument some- 
times offered that the best clinician is the appropriate representa- 
tive for comparison with the statistical technique. Actually, if 
we judge from the studies reviewed, even this standard of
evaluation would probably not do much to change the box score as
between the two methods. But is the proposal a sound one
anyhow? Statistical considerations make it clear that to apply such
a principle in the interpretation of a single cross-sectional study
of one set of clinical and actuarial predictions would be to
introduce a serious pro-clinical bias, because the observed variation
in hit rates is generated by whatever stable individual differences
exist among the clinicians plus random errors, the relative
contribution of these components being unknown and not accurately
assignable in such a design. The seriousness of this problem is
greater as we deal with more clinicians, and of course appears
also on the side of the mechanical methods when more than one is
tried (as in the Barron study). Presumably some kind of
longitudinal study is needed to find out whether and to what degree
the “good” clinician is stably such, rather than being merely the
momentarily luckiest fellow among a crew of equal or near-equal
mediocre guessers. Even a clear proof of stable differences among
clinicians would still leave us with a serious practical problem.
Suppose that one in ten clinicians, sampled randomly from
some national population, can do consistently better than the
statistical formula in a specified type of prediction problem. In
attempting to utilize such published information administratively
in a different clinical installation, we definitely stand to lose if
we treat all, or most, or a randomly chosen subset, of our clinical
staff as if they were among that one tenth. Unless we have an
accurate method of identifying this local representative of clinical
genius, it is not of any practical value to know that he exists (or,
more accurately, that there is a certain probability that we have
one such). In fact, if we have fewer than seven clinicians on our
staff, the odds are that he is not among us at all! And, of course, 
even if he is, an over-all clinic policy of acting on clinical judg- 
ment when it contradicts the actuarial prediction will not pay off 
unless the unidentified expert is so markedly superior to the 
actuary that he can counterbalance the deficit accruing from our 
concurrent reliance upon the less efficient guesses of his col- 
leagues. A little arithmetic will convince the reader that if only 
a small minority of clinicians excel the actuarial method, it would 
take an impossibly high superiority to justify a blanket shift to 
the clinical mode. For some values of the rates involved, algebraic 
constraints make it simply impossible for a few such deviates to 
be good enough (i.e., more than perfect!) to make up for the 
losses. 
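
The "little arithmetic" can be sketched in a few lines. This is only an illustration under stated assumptions, not a calculation taken from the studies: staff members are treated as independent random draws from a population in which one clinician in ten is superior, cases are assumed to be spread evenly over the staff, and the hit rates used (0.75 for the actuary, 0.70 for the ordinary clinician) are hypothetical figures chosen for the example. The function names are likewise invented for the sketch.

```python
def prob_no_expert(staff_size, expert_fraction=0.10):
    """Chance that a randomly assembled staff contains no superior clinician."""
    return (1.0 - expert_fraction) ** staff_size


def required_expert_hit_rate(actuary_rate, ordinary_rate, expert_fraction):
    """Hit rate the superior minority would need so that a blanket policy of
    following clinical judgment merely breaks even with the actuary."""
    ordinary_share = (1.0 - expert_fraction) * ordinary_rate
    return (actuary_rate - ordinary_share) / expert_fraction


# Probability of having no superior clinician on staffs of one to eight.
for size in range(1, 9):
    print(size, round(prob_no_expert(size), 3))

# Hypothetical rates: actuary 0.75, ordinary clinician 0.70, one expert in ten.
print(required_expert_hit_rate(0.75, 0.70, 0.10))
```

With one superior clinician in ten, the chance of having none on a staff of six is about 0.53, and it first falls below one half at a staff of seven. With the hypothetical rates above, the superior tenth would need a hit rate of about 1.2, which is the "more than perfect" case.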

The only way to get around this problem is to identify the
better-than-actuarial staff in each clinic. This in turn means either
doing almost the whole comparison study over again in every
installation, or developing highly accurate indirect methods (e.g.,
personality tests) for detection of such personnel. Evidence to
date on the generality of such traits, as well as the general drift
of selection studies in other areas, can hardly make us optimistic
about this approach, although it should be thoroughly explored.
Finally, unless there is greater generality over clinical skills than
we have any reason to expect, not only the personnel involved but
also the prediction problem cannot be changed without raising
the issue of generality in predictive skill. And even if the
generality were very high in correlational terms, we need to know how
the absolute predictive efficiencies in the new problem compare with
those of the actuarial method. The difficulties and complexities
involved in the practical use of a finding that some subset of
clinicians can excel the actuary are [...]. It is hard for
clinicians to assimilate the [...] of the preceding lines, chiefly, I
gather, [...] on the unfortunate case who would have done
better had we followed the clinician in defiance of the actuary. But
what such objectors do not see is that in order to save that case,
they must
lay down and abide by a decision-policy which will misclassify 
some other patient by defying the statistics. Presumably it hurts 
me as a patient just as much to be misevaluated regardless of 
whether the final mistaken judgment is made by a Ph.D. or by a
clerk. A clinic’s departure from the optimal method merely effects
an exchange of some cases for others, but doesn’t quite break
even on the exchange. I do not quite know how to alleviate the
horror some clinicians seem to experience when they envisage a
treatable case being denied treatment because a “blind, mechanical”
equation misclassifies him, except to reiterate that the only
way we could have prevented this happening to him would be to
have employed a strategy which, while saving him, would
systematically have guaranteed that the same error would be made
with respect to somebody else whom we have in fact saved. This
other error is, of course, just as blind as one made by the blindest,
cut-and-dried formula, since the plain fact is that the clinician,
with wide-open eyes (and supernumerary ear), nevertheless did
not see the world rightly. So it is the number of errors by the
two methods that is all-important. In this connection see the
excellent book by Bross (16) and the paper by Duncan et al. (36).

It seems to me from these considerations that our decision as
to the economical and ethical thing to do cannot validly be
influenced by the possibility of the best clinicians being somewhat
better than the statistician; and that the burden of proof lies
upon those who advance this argument to show empirically that
such deviates show sufficient generality over the major predictive
tasks and can be accurately identified by feasible methods. Until
this is done the argument is on very shaky ground.
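
The earlier point about the "momentarily luckiest fellow" can be made concrete. The sketch below assumes, purely for illustration, a crew of clinicians who all have the same true hit rate and who each predict the same number of independent cases; it then computes exactly the hit rate one should expect from whoever happens to come out on top in a single study. The figures (100 cases, a true rate of 0.70, ten clinicians) and the function names are hypothetical.

```python
from math import comb


def binomial_cdf(n_cases, hit_rate):
    """P(hits <= x) for x = 0..n_cases, for one clinician."""
    cdf, running = [], 0.0
    for x in range(n_cases + 1):
        running += comb(n_cases, x) * hit_rate**x * (1 - hit_rate) ** (n_cases - x)
        cdf.append(min(running, 1.0))
    return cdf


def expected_best_hit_rate(n_cases, hit_rate, n_clinicians):
    """Expected observed hit rate of the top scorer among equally skilled,
    independent clinicians."""
    expected_hits, previous = 0.0, 0.0
    for x, f in enumerate(binomial_cdf(n_cases, hit_rate)):
        current = f**n_clinicians          # P(best score <= x)
        expected_hits += x * (current - previous)
        previous = current
    return expected_hits / n_cases


# Ten equally mediocre guessers, 100 cases each, true hit rate 0.70.
print(round(expected_best_hit_rate(100, 0.70, 10), 3))
```

Under these assumptions the best of ten is expected to score several points above the common true rate although no one in the crew is actually better than anyone else, and the inflation grows as more clinicians are compared.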

These studies do not tell us much about the kind and amount 
of clinical study that is competing with the actuarial method. On
a priori grounds, one might expect that mediocre or poor clinical
methods would be inferior to the actuarial, since the latter is
always as good as the sample can make it, but that superior clinical
methods might be better than the actuarial. The studies did not
involve collection of clinicians’ judgments of their own confidence,
and it is important to know whether a subset of the clinically made
predictions can be identified as better guesses by means of the
associated feelings of assurance. This would imply, if the clinician
and actuary are equal on the whole, another subset in which the
clinician actually runs worse, which, of course, is quite possible.
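
The implication about the remaining subset is simple bookkeeping, sketched below. The function name and the figures (a confident subset making up 30 per cent of the cases, on which the clinician beats the actuary by 15 points) are hypothetical and serve only to show the constraint.

```python
def clinician_margin_on_rest(confident_fraction, margin_on_confident,
                             overall_margin=0.0):
    """Clinician-minus-actuary hit-rate margin on the cases outside the
    high-confidence subset, given the margin inside it and the overall margin."""
    rest_fraction = 1.0 - confident_fraction
    return (overall_margin - confident_fraction * margin_on_confident) / rest_fraction


# Equal overall; clinician ahead by 0.15 on the confident 30 per cent of cases.
print(clinician_margin_on_rest(0.30, 0.15))
```

If the two methods tie over all cases, any advantage the clinician shows on his confident predictions must be paid for by a deficit on the rest; here the deficit is a little over six points on the remaining 70 per cent.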

Since the issue being considered is the relative efficiency of
actuarial and nonactuarial methods of combining the same data
to yield a prediction, one might ask how we know that the clinician
is actually making use of all the information at his disposal,
or at least is employing to the best of his ability that fraction of
the information which is being combined by the regression equation.
In the studies cited we do not know how much of the relevant
information the clinician is combining. There is some evidence in
Sarbin’s study that the clinician actually makes less use of some
parts of the information than he should, or, more precisely, that
he weights certain parts of the information too heavily. But I do
not believe this should be prevented in such comparative studies,
since so long as the data are made available to him, the clinician
should be permitted to assign the predictive weights. There is no
reason to exclude artificially the special case in which he assigns
a zero weight to a factor and thus fails to use it in prediction. The
ability of the clinician to weight the information is precisely what
is being studied in such comparisons, and the decision as to
whether β = 0 is a special case of this general problem of weight
assignment. Consequently, all that the experimenter has to do is
to assure himself that any information used actuarially is available
to the clinician, regardless of whether the latter sees fit to
use it.

Future studies of the [...] should either re[...] should be
arranged beforehand to assign a “doubtful” [...] of the predictor
variables [...] is least trustworthy, so [...] be permitted to avoid
prediction in a stated fraction of cases. Otherwise, meaningful
comparison of the efficiency of the two methods is difficult.



In spite of the defects and ambiguities present, let me emphasize
the brute fact that we have here, depending upon one’s
standards for admission as relevant, from 16 to 20 studies
involving a comparison of clinical and actuarial methods, in all but
one of which the predictions made actuarially were either
approximately equal or superior to those made by a clinician. Further
investigation is in order to eliminate the defects mentioned, and
to establish the classes of situations in which each method is more
efficient. I do not feel that such a strong generalization as that
made by Sarbin is warranted as yet. Note that in terms of the
kind of thing being predicted, there is not much heterogeneity.
Essentially three sorts of things are being predicted in all but one
of these studies, namely: (1) success in some kind of training or
schooling; (2) recidivism; (3) recovery from a major psychosis.
Studies of prognosis for outpatient psychotherapy of neurotics,
probably the most important situation in terms of current
predictive practices, are represented only by the work of Barron (8),
in which the information available for clinical prediction was
confined to age, sex, and the thirteen scores on the MMPI, hardly
a typical setup as clinicians operate. Nevertheless, it is clear that
the dogmatic, complacent assertion sometimes heard from
clinicians that “naturally,” clinical prediction, being based on “real
understanding,” is superior, is simply not justified by the facts
collected to date. In about half of the studies, the two methods
are equal; in the other half, the clinician is definitely inferior.
No definitely interpretable, fully acceptable study puts him clearly
ahead. In the theoretical section preceding we found it hard
to show rigorously why the clinician ought to do better than the
actuary; it turns out to be even harder to document the common
claim that he in fact does!

Perhaps I ought to be embarrassed by this latter point, having
devoted so much time to a theoretical discussion of how the
clinician’s operations could transcend the limitations of the clerical
worker. Now I cite a mass of empirical studies indicating that as
a matter of fact they do not. I imagine that most clinicians will
feel themselves still persuaded of something about clinical methods
by the examples given in the theoretical section, and
from their own interview experience, in spite of the
studies. I have to admit that I share this weakness. At the risk
of seeming to defend the clinician’s special talents at any cost,
let me suggest some differences between the situations that
convince clinicians of their powers and the situations dealt with in
the studies I have cited. In suggesting these differences I am not
trying to escape the burden of the nineteen studies. I think they
should be taken very seriously and that clinicians should be
humbled by them. My purpose in the following remarks is
to “explain” to myself as clinician what it is that these studies
show, and to find out, if possible, how we could have been so
mistaken in our expectations as clinicians about the outcome of such
studies. Essentially I shall argue this: The kind of episode during
therapy which gives us a conviction of our own predictive powers
may be quite legitimate, but the transition to the straight
prediction problem involves features which seriously impair the
analogy between the two sorts of situation. In other words, even
if the clinician is right in believing that his “third ear”
could not be duplicated by a clerk, this should still not lead him
to expect other results than those in the studies cited.

In the first place, there is a major pragmatic difference between
the predictive demands made upon the clinician during therapy
and those made in the purely prognostic setting. All of us expect
a certain amount of blind-alley hypothesizing to occur in the
course of a therapeutic series. Therapists form transitory hypotheses
of extreme tentativity and often may not follow them up
by so much as a leading question unless additional support
subsequently appears in the client’s spontaneous productions. In
interpretive therapy of moderate duration (say, 25 to 50 hours)
one ordinarily expects some fraction of the total conversation to
be devoted to the exploration of possibilities that turn out to be
of minor significance or (more rarely) totally irrelevant. Nobody
knows what the payoff rate is for these moment-to-moment
guesses that come to therapists; but the over-all success frequency
might be considerably less than 50 per cent and still justify
the guessing, for unless the therapist is clumsy or the client un- 
usually impatient, the time spent on exploration of poor guesses 
need not greatly detract from the positive contribution of suc- 
cessful ones. Presumably even the unsuccessful paths are rarely 
pure waste, since they contribute to such diverse concurrent aims
as further getting acquainted, general desensitization, and
incidental support for quite unrelated constructions (e.g., how free
is the client to suggest that something is getting nowhere? Is he
too docile in following the therapist into this blind alley? What
does he do with the flat interpretations that are likely to emerge
in these doldrums?). But even if the to-be-discarded hypotheses
were pure filler, they would not impede the therapy except as they
consumed time. The natural tendency of therapists to “mark
where they hit and never where they miss” may lead them to have
more confidence in their hypothesizing than is justified; but it is
not clear just how they should be asked to alter their behavior if
this is true. If the success frequency of intercurrent therapeutic
hypotheses is .35 and a therapist thinks it is .60, knowledge of this
discrepancy might lower his morale but it would be a complicated
matter to say precisely how it should affect his interviewing
procedures. The point is that Reik may be justified in the kind of
thinking shown in the abortion example even if, for any single
guess, the odds are against him.

When we move over into the straight prediction situation, all 
this is radically changed. Here, no erroneous weighting is filler 
in the above harmless sense, because statistical filler is error
variance. Every time a clinician pays attention to a factor in
predicting a single case, he is betting that the factor is not filler.
Further, if he corrects his tentative prediction by a certain amount
because of this factor, he is assigning a definite weight to it in
his prediction equation. Error in this weighting (which no one
will deny is practically inevitable) contributes to error in
prediction. It is well known that if a variable X has zero beta in
a true regression system, assignment of any non-zero beta will
necessarily increase the mean squared error. So that in the straight
prediction setting, all bad ideas tend to subtract from the power
of good ones. The realization of this difference between prediction
and therapy should help clinicians to accept the rather unpalatable
findings of prediction studies, to the extent that their
incredulity flows from a conviction that the third ear does pay off
therapeutically. It may do so, and I myself believe that it does.
But this is not good counter-evidence against Sarbin’s claim that 
it does not pay off in the straight prediction situation. 
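
The statement about zero-beta variables can be illustrated by a small simulation. This is a sketch under assumptions of my own choosing, not an analysis of any study cited: a criterion y depends on one valid predictor x with a true weight of 0.5 plus unit-variance noise, while a second variable z is irrelevant (true weight zero) and independent of everything else. In that setup, giving z a weight b is expected to raise the mean squared error by about b squared times the variance of z. The variable and function names are hypothetical.

```python
import random


def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def simulate(weight_on_irrelevant, n=20000, seed=0):
    """MSE of a prediction that uses the correct weight on x but also gives
    the stated weight to an irrelevant variable z."""
    rng = random.Random(seed)                      # same data for every weight
    x = [rng.gauss(0, 1) for _ in range(n)]        # valid predictor
    z = [rng.gauss(0, 1) for _ in range(n)]        # irrelevant factor
    y = [0.5 * xi + rng.gauss(0, 1) for xi in x]   # criterion
    predicted = [0.5 * xi + weight_on_irrelevant * zi for xi, zi in zip(x, z)]
    return mean_squared_error(y, predicted)


for b in (0.0, 0.3, -0.3, 0.6):
    print(b, round(simulate(b), 3))
```

The sign of the weight does not matter: attending to the irrelevant factor in either direction adds error, and the penalty grows with the square of the weight assigned.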

Another difference between the Reik type of example and the 
quantitative studies cited is that the latter all involve the
prediction of a somewhat heterogeneous, crude, socially defined
behavior outcome. In Sarbin’s study, for instance, we are not
concerned to predict individual reaction-forms or even a specified
disjunction of response classes, but rather a socially defined
outcome, namely, satisfactory grades on the student’s record. In the
case of academic success, and even more strikingly in the case of
failure, there are many alternative paths. The student’s honor-point
ratio is itself a statistical quantity, related only by great
indirection to the multifarious individual reactions and decisions
that were made over a long period of time, and which contribute
by small increments to the determination of his particular number.
It might seem that the case of parole violations constitutes
an exception to this generalization, since one usually violates
parole by being detected in a single forbidden act. But this would
be a superficial analysis of the case. While it is usually a particular
action which comes to the attention of the parole authorities,
everybody knows that a large number of criminal occurrences
do not come to the attention of the law; and of those which do, a
considerable number never become legally attached to a particular
criminal. It is very probable that those parolees whose parole
is revoked because they have violated one of the rules of parole
have been detected in one of a considerable number of criminal or
forbidden acts, the rest of which went undetected. This means
that parole violators, as a group, are those who over a period of
time have responded on a much larger number of occasions in
forbidden ways than nonviolators, so that the probability of
detection is considerably raised for the former group.
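
The dependence of detection on the number of forbidden acts can be sketched numerically. The sketch assumes, for illustration only, that each act is detected independently with the same small probability; the 5 per cent detection rate and the counts of acts are hypothetical, as is the function name.

```python
def prob_detected_at_least_once(n_acts, detection_rate):
    """Chance that at least one of n independent forbidden acts is detected."""
    return 1.0 - (1.0 - detection_rate) ** n_acts


# Hypothetical detection rate of 5 per cent per act.
for acts in (1, 2, 10, 30):
    print(acts, round(prob_detected_at_least_once(acts, 0.05), 3))
```

On these assumptions a parolee with two forbidden acts is detected a little under one time in ten, while one with thirty is detected more than three times in four, so the recorded violation behaves like a cumulative index of many acts and not like a single event.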




With this explanation, I think we can say that the kinds of 
prediction investigated in the studies cited have the common 
property mentioned. If we consider now the examples cited in my
theoretical discussion of clinical activity, we see that they in- 
volve relatively specific and concrete predictions and postdictions, 
e.g., that the patient had an abortion performed, that on a cer- 
tain night in walking home the patient was having the uncon- 
scious fantasy that she would leave her husband and make him 
feel sorry, in turn entailing the prediction that she would talk 
about material of this type in the succeeding moments of the 
interview. I think that if a clinician asks himself what kind of 
evidence causes him to remain unconvinced by the statistical 
studies I have cited, he will find himself thinking of individual 
predictions and postdictions of this concrete type.

If I am correct in this, we might think about it in the following 
way. Since a very large number of concrete situations, in relation
to specific and general psychological complexes, determine jointly
the long-time social outcome in the case of such a question as
surviving in school, in order to predict this outcome by clinical
understanding it would be necessary to formulate an extremely
detailed conceptual model of personality structure. This model,
each of the components of which would have to be highly
confirmed, would then have to be combined with an extremely
detailed account of the situations which the subject would meet
during his first academic year. These two together would then be
used to arrive at concrete predictions of many single episodes, or
at most restricted classes of episodes, for example, whether the
subject would attend a certain musical comedy two nights before
his midquarter examination in differential equations. These
concrete predictions would then be cumulated in some complicated
way to arrive at the prediction of the honor-point ratio. Now it
is obvious that in none of the studies cited did the clinician have
an opportunity to “formulate the personality” or to determine
the press in anything like the detail indicated. Under these
circumstances the appropriate attitude would be something like
this:
“In order to predict the broad social outcome of success in
college, I would have to know a very great deal about the
individual, which I cannot learn in a matter of an hour’s interview
or even in several hours. I would also have to know
a great deal about the kinds (and dates!) of situations to which
he was going to be subjected, which would not be possible even
with an army of social workers at my disposal. Therefore, I shall
abandon the effort to mediate my predictions by means of
hypothesis formations concerning the personality structure.
Instead, I shall fall back on the well-known psychological rule that
the best way to predict the way a person is going to act is to find
out how he has acted in the past. I know that there are a great
many ways of behaving which contribute to academic success,
and their relationship to the need structure of an individual is so
complicated that only a very intensive study would enable me
to make use of this need structure in prediction. But if I take
more of an ‘empty organism’ attitude, I can simply ask, ‘Have the
complex, heterogeneous behaviors of this individual in the past
been on the whole such that he has achieved academically?’ If
they have, I shall make the assumption (which would be true of
the great majority of people in my sample) that his behavior
dispositions, whatever they are, will remain relatively stable during
his first year in the arts college. Those students who have, in their
diverse ways, behaved so as to get good marks in high school and
good marks on my entrance examination, will usually continue to
behave in the same sorts of ways. To attempt to characterize
these sorts of ways in any detail, upon the evidence available to
me, would be fruitless.”


I am inclined to think that the same sort of consideration would
apply in the case of a more strictly clinical domain, such as the
prediction of response to psychotherapy. If I wanted to know
whether a certain event had occurred in the patient’s past, or
whether a circumscribed topic would recur in the associations in
the next few interviews, I think I would prefer to proceed
nonactuarially. If, on the other hand, I was asked to predict whether
a patient would respond well to psychotherapy which, after all,
involves a socially defined outcome achieved by a very large 
number of individual learnings, I would trust such statistics as 
the duration of his illness, the number of previous therapeutic 
efforts made upon him, and the prognosis associated with his 
psychiatric diagnosis, more than any clinician’s judgment based
upon an estimation of his dynamics. 

One very striking difference between the empirical studies and 
the clinical examples lies in the form in which the prediction prob- 
lem is couched. The task facing the actuary is rather like that 
presented on a multiple-choice test, in the sense that the actuary 
(and the clinician competing with him in these studies) is initially 
presented with the available alternatives. Thus, we are told that 
the students in Sarbin’s study must either fail or pass, that the 
schizophrenics in Wittman’s study either recover or remain ill, 
and so on. The class of possibilities is indicated for us, and the 
predictive task we face for each individual case is to assign him
to one of these. Even in the continuous case, such as predicting
freshman honor-point ratio, we are still informed that the variable
being predicted is a number from -1 to 3.0, and we are aware of
a good deal of qualitative matter about this dimension. Now
contrast with this the clinical examples we have discussed. Here the
clinician has in a sense to create the prediction, not merely to
say “Yes” or “No” to certain indicated alternatives.

I am not here talking about whether, in some philosophical 
sense, the actual set of meaningful possibilities is finite. What I
wish to stress is the concrete psychology of the task as presented
externally. Sarbin’s clinicians, as well as his statistical clerk, were
told the nature and range of the continuum to be predicted. They
did not have to call upon their previous experience, hoping that
by synthesis and recombination their brains would, so to say,
form and then riffle through all the possible states of affairs. By
contrast, in, say, Reik’s postdiction of the abortion, the population
of alternatives is as large as that of human thought and life.
Any restriction of the hypotheses to a narrow class, however
plausible, has come from the clinician. No one has listed for Reik
a set of 30,000 latent thoughts (which could yield silence
followed by “There’s a book on its head”) one of these being that
of an abortion. Somehow this response must be emitted by Reik 
in the presence of the ambiguous stimulus. In a very real sense 
it is the difference between taking an individual Rorschach and 
filling out a true-false inventory. Or, to take the example of the 
raven, no one has ever seriously proposed providing clinicians 
with a master list which would contain all possible fantasies re- 
garding sleeping husbands. Yet, from the standpoint of mere 
mechanics, how else, lacking such a preposterous table, can we 
enable the nonclinical clerk to think of them? Statistical weights 
enable us to assign probabilities to values of variables (including 
the disjuncts in a class of named alternative situations). They do
not, quite obviously, enable us to fantasy the situations or to list 
their names! 

If the Polansky study were repeated using actuarial methods,
but with the prediction problem remaining as specific as it was
in his investigation, it would be interesting to see whether the
advantage of the actuarial method would appear. These
considerations are, of course, entirely speculative, and it is of the
greatest importance that suitable experimental designs should be
worked out for the actuarial study of such moment-to-moment
clinical predictions as are discussed in our theoretical section. The
possibility that the choice of more suitable prediction problems,
in which the advantage of structural-dynamic hypotheses would
have more chance to show itself, might lead to superiority of the
clinician [...] from the practical [...] of such studies as those by
Sarbin and Wittman. Prediction of length of hospitalization,
response to certain kinds of therapy, and perhaps even exacerbation
of illness, resemble the [...] more than they do the [...]. It
would be wise for a psychologist who is asked by the psychiatrist
whether or not a patient will benefit from shock treatment to place
his reliance on actuarial rather than nonactuarial procedures.

All the above comparisons have treated efficiency solely in
terms of predictive success. This method of evaluation, while
illuminating for theoretical purposes, actually gives the clinician
a considerable advantage. For practical purposes, the concept of
efficiency must include some reference to the amount and level
of work required to arrive at a given degree of predictive success.
Once some sort of statistical backlog has been collected (and this
takes no more time than is needed for the clinician to get
experience), the actuarial method almost invariably takes less time,
less effort, and (no minor point) can be entrusted to lower paid
personnel possessing much less skill. Any realistic assessment of
the comparative efficiencies of the two methods must give very
heavy weight to these considerations. A concrete feeling for this
point may be readily gained by reading the report of Kelly and
Fiske (63) in which the mechanical use of the appropriate Strong
scores obtainable by mail at low cost is more effective than seven
man-days of skilled clinical time and a cost of $300 per case.
The several hours of highly skilled work sometimes expended
in arriving at a dynamic formulation of the patient by an ingenious
extrapolation of test results could very possibly be spent
much better in added hours of psychotherapy. Whether the
patient is seen in private practice or in a charity hospital, the
skilled clinician is being paid, and someone is footing the bill. It
has often struck me as paradoxical to find a near-routine battery
of complex, skill-demanding tests being administered in a clinical
setting where the median number of therapeutic hours per case
is not appreciably in excess of the total skilled time expended by
the psychologist on the case in making often dubious dynamic
and prognostic inferences from the test data. A really honest
examination of this sort of question contains, needless to say, a
great deal of dynamite for the profession. Sooner or later it must
be done; and the socially significant meaning of the phrase
“predictive efficiency” will have to be employed rather than the
theoretical meaning we have used throughout the present
discussion.
Although I can present no statistics on the matter, I have a
distinct impression that the amount of time expended by a
psychologist in the administration of multiple and fancy
devices, and in the dynamic formulation based upon an
integration of them, progressively declines as this matter of
economics is increasingly highlighted by the character of his
practice. Psychologist X gives four or five projective tests, a
Shipley, a Wechsler-Bellevue, and an MMPI, and chews over the
results ad nauseam when he is functioning in an institutional
context. He picks up his semimonthly check in any event, so the
value of time per case, while willingly conceded as important in
abstracto, is not strikingly called to his attention. We find the
battery, and the time spent on interpreting it, undergoes a
suspicious shrinkage when our psychologist acts as consultant to a
privately practicing psychiatrist with the latter collecting the
fee for him. Lastly, behold him in his own private therapeutic
practice, where he himself is the evaluator of the therapeutic
power conferred by his armamentarium and he himself has to put
the financial bite on his client. His enthusiasm for “advance
knowledge through dynamic integration” has now so flagged that
we find him slipping the client a quick Bender and sending him
home with a group form MMPI to be filled out between sessions.
I have a hunch that some profound and terrifying truths are
discernible in this psychometric devolution but I shall not press
the point.




General Remarks on Quantification 
of Clinical Material 


In considering the problem of actuarial prediction one often
finds certain misconceptions held by the more tender-minded
which prevent clear thinking. For instance, there is the
misconception that mathematical descriptions of persons in terms
of scores require that persons achieving identical scores shall
be identical or indistinguishable with respect to the traits so
quantified. We sometimes hear this view expressed in such
statements as “A human being is more than just a set of
numbers.” It is pointed out that two persons who achieve a score
of 1.5 sigmas above the mean on an introversion test do not
manifest their introversion in precisely the same way, and that
they did not arrive at it via the same sequence of experiences.
The first thing to be said about such a statement is that it is
true. But the uniqueness of the single case is no more fatal to
psychology than it is to physics. To see it as fatal to
psychological quantification is to forget that the class character
of concepts and dimensions is found in all descriptive
enterprises. As Cattell says, “It seems that one must subscribe
to the extreme sense [...] argument and admit that all traits are
in some way unique” (28, p. 61). No two individuals are exactly
alike, and no verbal or mathematical characterization can do
complete justice to their individuality. No two explosions are
identical nor
can any system of equations give a description of any of them 
which is exhaustive. As Thurstone has pointed out, those who 
object to assigning the same score to two introverts because their
introversion is distinguishable should in all consistency object to
saying that two men have the same income since one of them
works and the other steals (100, p. 54). A cannon ball falling
through the air is “more than” the equation $S = \tfrac{1}{2} g t^2$, but this
has not prevented the development of a rather satisfactory
science of mechanics. The exhaustive description of an individual
event is not aimed for in the scientific analysis of the world nor
can it be hoped for in any descriptive enterprise (54, 76). All
macroscopic events are absolutely unique. It is a further mistake


to exaggerate the degree to which this lack of concreteness
reflects a special failing [...] set of percentile ranks, no [...]
components, and no para[...] can contain all the richness [...]
abstractive or summarizing [...] is shared by differential
equations, maps, [...] So-called scientific description, however,
[...] which are most relevant in terms of causal [...] aims; and,
secondly, employs a language ([...] possible, but not always!)
which minimizes ambiguity. [...] case of continuous variables this
is even more obvious.


It is sometimes suggested that mathematical description assumes
that equal amounts of a component must always mean the
same thing psychologically. For instance, the difference between 
zero M and 4 M in the Rorschach of a bright adult “means” more
than the difference between 7 M and 11 M. What actuarial as- 
sumption denies this? We know that a change from 98.6° to 99.6° 
is more significant of pathology than one from 101° to 102°. The 
confusion present in this argument is perhaps partly the fault of 
the psychological statisticians who have confined themselves 
largely to the study of linear relations such as are used in mul- 
tiple regression; but these arbitrary restrictions are, of course, not 
a necessary consequence of the application of mathematical 
methods. Incidentally, there is a lot of loose talk around these 
days about nonlinear relations. I do not doubt that there are a 
large number of such in the behavior domain, but we ought not 
to browbeat the statisticians with this phrase until we know more 
about where these nonlinear dependencies occur and how much 
they pay off predictively over and above the much-maligned 
linear regression system! Clinicians intoxicated with the abstract 
idea of nonlinearity and interaction of variables might contribute 
to their historical perspective by reading a short paper of Thorn- 
dike’s published in 1918 (99). 
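
The question of how much a nonlinear relation pays off over a straight line can be put numerically. What follows is a minimal sketch in Python, not anything taken from the studies cited; the two curves and the names `r_squared`, `monotone`, and `u_shaped` are invented for illustration. It computes the share of variance that a single straight line captures, first for a curved but steadily rising relation and then for a U-shaped one.

```python
from statistics import mean


def r_squared(xs, ys):
    """Squared Pearson correlation: the share of variance in ys
    that the best-fitting straight line on xs accounts for."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical predictor values and two invented nonlinear criteria.
xs = list(range(1, 101))
monotone = [x ** 0.5 for x in xs]          # curved, but always rising
u_shaped = [(x - 50.5) ** 2 for x in xs]   # rises on both sides of the midpoint

r2_monotone = r_squared(xs, monotone)
r2_u_shaped = r_squared(xs, u_shaped)
print(round(r2_monotone, 3), round(r2_u_shaped, 3))
```

Under these assumed curves the straight line loses little on the monotone relation and nearly everything on the U-shaped one, which is the kind of fact one would want in hand before invoking nonlinearity against the regression equation.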

A further confusion is involved in the frequent claim that
mathematical description or prediction involves the assumption
of simple additive relations among the variables and is
inadequate to deal with dynamic interactions. (“Additive” is a
favorite pejorative epithet with some clinicians.) For example,
the occurrence of several M with 7P and 88 per cent F+ on the
Rorschach is healthy. The same number of M with only 1P and 50
per cent F+ suggests a malignant break of fantasy from reality,
such as in a paranoid schizophrenia. Zero M with 1P and 50 per
cent F+ might be suggestive of low intellect. The only remark to
be made on this score is that mathematical analysis does not in
any way exclude such possibilities. The mathematical treatment of
nonadditive situations is found in great profusion among the
formulations of Hullian learning theory (a system which I have
heard clinicians confusedly stigmatize as “atomistic”; Hull's
composite expression for reaction potential involves a product of
several part functions of independent variables, and even those
part functions are not simple linear relations). Probably much of
the talk about “patterning” as something to be contrasted with
“statistics” would not occur but for the fantastic [...]
ignorance of most clinicians. I have heard clinicians discuss this
topic in such a fashion that the only possible inference one
could draw was that they had never heard of the interaction
term in the analysis of variance!

It is difficult to attempt a precise discussion of the problem
of patterning, since we generally use the term with a certain
looseness, to characterize what may be a very heterogeneous group
of types of functional dependency. However, a perusal of the
clinical literature leads to the identification of one kind of
situation as the commonest to which the term is applied. That is
the situation in which the indication of a given variable with
respect to the criterion is not constant, but the weight, and
possibly even the direction (sign) of contribution of that
variable, are functions of the values which the other predictor
variables have taken on. How important such a refinement actually
is remains an open problem which I shall not consider here; but
the clinician may be pardoned for his irritation when a
nonclinical academic psychologist with some statistical interest
informs him, on the one hand, that whatever the clinician does is
essentially statistical, and on the other hand, fails to present
him with convenient methods for expressing such a state of
affairs. It is sometimes said that the differential weights of
the regression equation or of the discriminant function were
devised specifically to take account of such relationships. It is
obvious that the kind of patterning which we are considering here
is not adequately dealt with by such procedures.


Probably the most striking case [...] in which neither variable
is related to the criterion, and yet the [...] is used as the
indicator of a relationship to which it is [...] suited. I mean
unrelated [...] of independence as [...] x and y are both totally
independent of a [...] criterion z, is it possible [...] a
knowledge of x and y alone? I have discussed the possibility of
such an extreme case elsewhere (72), and Horst (56) has recently
presented a generalized mathematical treatment. In order for
patterning effects to occur, it is, of course, not necessary that
the unpatterned variables have zero validity, although that is the
most striking case for getting the point across.
In the case of continuous predictive functions, what makes the
system patterned is that the rate of change of the criterion
estimate with respect to one of the predictor variables depends
upon one or more of the other predictor variables. This is a
stronger claim than mere nonlinearity, of which it is one, but not
the only, form. A predictive function $y = \log \sin x_1 + x_2^3$ is not
linear and would be rather poorly approximated by the usual
multiple regression methods. But neither is it patterned, because
the mode of dependence of $y$ upon $x_1$ is invariant with respect to
the values taken on by $x_2$, and conversely. On the other hand, a
predictive function such as $y = x_1 + x_2 x_3$ is patterned, because the
effect of an increment in $x_2$ depends upon the value of $x_3$.
Similarly, $y = x_1 + {}$[...] is patterned, in a more complex manner.
If the values of the x-variables are grouped and thus divided into
discontinuous levels, what we have is simply a significant
interaction term in the analysis of variance.
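
The distinction between a merely nonlinear function and a patterned one can be checked numerically by asking whether the effect of a small increment in one predictor changes when another predictor is moved. The following is a minimal sketch in Python under stated assumptions: `separable` is an invented additively separable function (not one taken from the text), `patterned` is a function in which the second predictor multiplies the third, and `cross_difference` is a hypothetical helper that forms the usual four-point finite-difference estimate.

```python
import math


def separable(x1, x2):
    # Invented example: nonlinear in each predictor, but the two
    # contributions simply add, so neither modifies the other's effect.
    return math.log(x1) + x2 ** 2


def patterned(x1, x2, x3):
    # The effect of an increment in x2 depends on the value of x3.
    return x1 + x2 * x3


def cross_difference(f, args, i, j, h=1e-3):
    """Four-point estimate of how the effect of a step in argument i
    changes when argument j is also stepped (a mixed second difference)."""
    def shifted(di, dj):
        a = list(args)
        a[i] += di
        a[j] += dj
        return f(*a)
    return (shifted(h, h) - shifted(h, 0.0)
            - shifted(0.0, h) + shifted(0.0, 0.0)) / (h * h)


sep_12 = cross_difference(separable, (2.0, 3.0), 0, 1)
pat_23 = cross_difference(patterned, (1.0, 2.0, 3.0), 1, 2)
pat_12 = cross_difference(patterned, (1.0, 2.0, 3.0), 0, 1)
print(sep_12, pat_23, pat_12)
```

In this sketch only the pair of predictors that enter as a product shows a cross difference appreciably different from zero; the separable function, though far from linear, shows none.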

Although it does not help us in hitting upon a configured
predictive function or in determining its parameters, I should
like to present an abstract definition of patterning for the
continuous case. I offer this mainly for the benefit of
statisticians who have wondered what clinicians could reasonably
mean by their talk of patterning, over and above (1) differential
weighting and (2) nonlinearity. The nature of the variables (i.e.,
whether phenotypic or genotypic, measured or judged, present or
future, outer or inner, psychometric or case history, etc.) is
irrelevant. Nor is the appropriateness of the metric relevant.
Consider a predictive function $y = f(x_1, x_2, x_3, \ldots, x_m)$.
Differentiating partially with respect to $x_i$ and then repeating
this with respect to $x_j$, we examine the second-order mixed
partial derivative

$$\frac{\partial^2 y}{\partial x_i \, \partial x_j}$$


If it is not identically zero (or at least does not vanish for
all values of $x_i$, $x_j$ within the empirically realized range), we
say that the predictor variables $x_i$ and $x_j$ are patterned with
respect to the criterion. If this derivative vanishes over the
range, they are unpatterned. If all the $m(m-1)/2$ mixed partials
of this sort vanish, the prediction system is unpatterned (or, if
it makes anyone happy to call it that, “atomistically” related to
the criterion). If all these partials are non-zero, the function
may be said to be patterned pairwise. This is what is being
claimed if we say “You can’t interpret any Rorschach variable
independently of any of the others.” If there exists a kth-order
mixed partial derivative


$$\frac{\partial^k y}{\partial x_1 \, \partial x_2 \, \partial x_3 \cdots \partial x_k} \neq 0$$

but all the $(k+1)$th-order mixed partials vanish, the function
may be described as patterned of order $k$. Finally, if we
partially differentiate successively with respect to all $m$ of the
variables and find that the $m$th-order mixed partial derivative

$$\frac{\partial^m y}{\partial x_1 \, \partial x_2 \, \partial x_3 \cdots \partial x_m}$$

is nonvanishing, we say the system is totally configurated with
respect to the criterion.

This last, very strong [...] common claim for [...] any variable
[...] interrelation of all the others. [...] to see anyone
actually [...]

the Q correlation is expressible as a function of a sum of squared 
differences (and hence it is nonlinear in the predictors) if one 
were to use it predictively, i.e., to predict the salience of a given 
trait in a self-sort from knowledge of the therapist’s other-sort, 
the usual unpatterned (and linear!) regression equation would 
be employed. It goes without saying that any practical applica- 
tion of genuinely patterned systems, particularly those of high 
order, will require the development of powerful searching methods 
for choosing the functions, and for estimating the constants. It 
seems unlikely, however, that any mathematical progress will free 
us from the necessity of a large N, to counteract the excessive 
sampling instability alluded to above in our discussion of the 
Hovey-Stauffacher investigation. 
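
The remark that the Q correlation is expressible as a function of a sum of squared differences can be verified directly when both sorts are forced into the same distribution, so that they share one mean and one variance; in that case r = 1 - (sum of squared differences) / (2 N times the common variance). A minimal sketch in Python under that assumption follows; the pile values and the names `pearson_r` and `q_from_squared_differences` are invented for illustration.

```python
from statistics import mean, pvariance


def pearson_r(a, b):
    """Ordinary product-moment correlation between two sorts."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)
    return cov / (pvariance(a) * pvariance(b)) ** 0.5


def q_from_squared_differences(a, b):
    """Same correlation, computed only from the squared differences.
    Valid only when both sorts use the same forced distribution."""
    n = len(a)
    common_variance = pvariance(a)
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - d2 / (2 * n * common_variance)


# Hypothetical pile placements of nine items in a self-sort and an
# other-sort; both are arrangements of the same forced distribution.
self_sort = [1, 2, 2, 3, 3, 3, 4, 4, 5]
other_sort = [2, 1, 3, 2, 3, 4, 3, 5, 4]

r_direct = pearson_r(self_sort, other_sort)
r_from_d2 = q_from_squared_differences(self_sort, other_sort)
print(r_direct, r_from_d2)
```

The two computations agree, which shows that the coefficient, though quadratic in the individual placements, carries no patterning in the sense defined in this chapter unless a configural function is deliberately built on top of it.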

A major research need is further empirical comparison of the 
two methods of prediction, with the elimination of the disturbing 
factors mentioned previously. On the formal side, we shall have 
to wait for the logicians to achieve a clarification of the nature 
of the concept of probability, especially the probability of hypoth- 
eses, and the general formulation of inductive logic. Systematic 
studies should be undertaken of the success frequency of certain 
subsets of the clinician’s predictions. For instance, at what type
of prediction is he best? What importance should be assigned to 
his own subjective degree of confidence? When the clinician and 
the actuary are in disagreement, to whom should we listen? This 
latter is important because one commonly hears it said by psy- 
chiatrists that they are predicting for the individual case, so that 
the greater success frequency of the actuary, even if clearly estab- 
lished, is treated as of no importance in practice. This thinking is, 
of course, thoroughly muddled. In any given instance, we must 
decide on whom to place our bets; and there is no rational answer 
to this question except in terms of relative frequencies. If, when 
the clinician disagrees with the statistics, he tends to be wrong, 
then, if we put our bets in individual instances upon him, we
will tend to be wrong also.
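
The question of whom to listen to when clinician and actuary disagree reduces to a tally over the cases of disagreement. The following is a minimal sketch in Python of such a tally; the record layout, the function name `hit_rates_on_disagreement`, and the six cases are invented for illustration and come from no study.

```python
def hit_rates_on_disagreement(records):
    """records: iterable of (clinician_prediction, actuarial_prediction, outcome).
    Returns hit rates computed only over cases where the two disagree,
    or None if they never disagree."""
    disagreements = [r for r in records if r[0] != r[1]]
    n = len(disagreements)
    if n == 0:
        return None
    clinician_hits = sum(1 for c, a, o in disagreements if c == o)
    actuary_hits = sum(1 for c, a, o in disagreements if a == o)
    return {"n": n,
            "clinician": clinician_hits / n,
            "actuary": actuary_hits / n}


# Hypothetical follow-up data, recorded before the outcomes were known.
records = [
    ("success", "success", "success"),
    ("success", "failure", "failure"),
    ("failure", "success", "success"),
    ("success", "failure", "success"),
    ("failure", "failure", "failure"),
    ("failure", "success", "success"),
]

tally = hit_rates_on_disagreement(records)
print(tally)
```

With a two-valued outcome exactly one party is right in every disagreement, so the two rates sum to one, and the bet in a new individual case goes to whichever rate is higher, given enough cases for the rates to be stable.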




10 


A Final Word: Unavoidability of Statistics 


All clinicians should make up their minds that of the two uses
of statistics (structural and validating), the validating use is
unavoidable. Regardless of one’s theory about personality and
regardless of one’s choice of data, whether Rorschach, MMPI,
Bender, age, marital status; regardless of how these data are
fused for predictive purposes (by intuition, table, equation, or
rational hypotheses developed in a case conference), the honest
clinician cannot avoid the question “Am I doing better than I
could do by flipping pennies?” In answer to a demand for
validation, one sometimes hears it stated that, whereas a certain
clinical device or method has not been proved valid “in the usual
sense,” such formal validation is not required, since the
instrument has been validated by its “clinical usefulness.” When
we hear this from a clinician, all we can say is that he thinks he
is using it to advantage. Out of the welter of diverse cases, with
mixed data and complex judgments, you simply cannot tell whether
your use of a procedure is paying off or not. Consider almost any
clinical instrument: there are many people, neither fools nor
knaves, who are willing to stand up in defense of it. Others,
equally competent, invoke the same kind of evidence, clinical
experience, as a basis for discarding it as useless. The
untrustworthiness of clinical impressions is by no means confined
to the behavior disorders, of course. Over a period of years
patients suffering from multiple sclerosis were treated by the use
of vitamins, diathermy, oral
administration of spinal cords, high dairy diets, potassium iodide,
quinine bisulphate, and now we have histamine. All these treat- 
ments found support with certain clinicians on the grounds that 
they were proving themselves useful in clinical practice. Most of 
them were subsequently abandoned when people began to keep 
systematic records of the ultimate results. Among his many vir- 
tues, the characteristic vice of our colleague the psychiatrist is his 
tendency to draw conclusions before graphs, and some detect a 
growing tendency for clinical psychologists to be cheerfully in- 
fected by this vice. 

What can clinical validation legitimately mean? Let us admit 
that validity is to be established by the application of a technique 
in the real life situation. Not all human motives are readily trans- 
plantable to the laboratory. Nevertheless, we must keep track of 
our guesses. “Leaving the laboratory” is not equivalent to
“scrapping the rules.” It is a common error to group the terms
“quantitative,” “statistical,” and “experimental” together,
setting them into opposition with “qualitative,” “clinical,”
“nonexperimental.” I have even heard psychologists use the terms
“quantitative” and “documentary” in such opposition, whereas it is
obvious that the quantitative study of documents is a rapidly
growing and powerful science. I would defend simultaneously (and,
I hope, consistently) the two propositions that (1) there are some
behavior phenomena which cannot be best studied in the laboratory,
at least with any confidence in one’s extrapolations, and (2)
until some quantification, at least frequency counts and
contingency measures, is applied to clinical evidence, we can have
very little confidence in our claims.


Is any clinician infallible? No one claims to be. Hence,
sometimes he is wrong. If he is sometimes wrong, why should we pay
any attention to him? There is only one possible reply to this
“silly” question. It is simply that he tends (read: “is likely”)
to be right. “Tending” to be right means just one thing: “being
right in the long run.” Can we take the clinician's word for this?
Certainly not. As psychologists we do not trust our memories,
and have no recourse except to record our predictions at the time,
allow them to accumulate, and ultimately tally them up. We do 
not do this because we have a scientific obsession, but simply 
because we know there is a difference between veridical knowl- 
edge and purported knowledge, between knowledge which brings 
its credentials with it and that which does not. After we tally our 
predictions, the question of success (hits) must be decided upon. 
If we remember that we are psychologists, this must be done,
either by some objective criterion, or by some disinterested judge
who is not aware of the predictions. When as clinicians we have
done all these things, and thus provided a secure basis for
deciding how much trust we can put in ourselves, what have we
done? We have carried out a validation study of the traditional
kind! I am led by this reasoning to the conclusion, in complete
agreement with Sarbin, that the introduction of some special
“clinical utility” as a surrogate for validation is inadmissible. If
the clinical utility is really established and not merely proclaimed,
it will have been established by procedures which have all the
earmarks of an acceptable validation study. If not, it is a weasel
phrase and we ought not to get by with it.
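
The penny-flipping question has a definite arithmetic form once predictions have been recorded and tallied. The following is a minimal sketch in Python, assuming a two-valued prediction for which a coin would be right half the time; the function name `chance_tail_probability` and the figures of 15 hits in 20 recorded predictions are invented for illustration.

```python
from math import comb


def chance_tail_probability(hits, n, p_chance=0.5):
    """Probability of at least `hits` correct predictions in `n` trials
    if every prediction were an independent guess with success
    probability `p_chance` (an exact binomial upper tail)."""
    return sum(comb(n, k) * p_chance ** k * (1 - p_chance) ** (n - k)
               for k in range(hits, n + 1))


# Hypothetical tally: 15 hits among 20 predictions recorded in advance
# and scored by a judge unaware of the predictions.
p_by_pennies = chance_tail_probability(15, 20)
print(p_by_pennies)
```

A small tail probability says only that the tally would be unusual for a coin; it does not by itself show that the clinician beats an actuarial table, which calls for the same tally against the table's own hit frequency.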

If a clinician says “This one is different” or “It’s not like the
ones in your table,” “This time I'm surer,” the obvious question
is, “Why should we care whether you think this one is different
or whether you are surer?” Again, there is only one rational reply
to such a question. We have now to study the success frequency
of the clinician's guesses when he asserts that he feels this way.
If we have already done so, and found him still behind the
frequency of the table, we would be well advised to ignore him.
Always, we might as well face it, the shadow of the statistician
hovers in the background; always the actuary will have the final
word.


References 


1. Abel, T. “The Operation Called Verstehen,” American Journal of Sociology,
    54:211-18 (1948).
2. Alexander, F. “Evaluation of Statistical and Analytical Methods in
    Psychiatry and Psychology,” American Journal of Orthopsychiatry, 4:433-38
    (1934).
3. Allport, F. H., and N. Frederiksen. “Personality as a Pattern of Teleonomic
    Trends,” Journal of Social Psychology, 13:141-82 (1941).
4. Allport, G. W. Personality. New York: Holt, 1937.
5. Allport, G. W. The Use of Personal Documents in Psychological Science.
    S.S.R.C. Bulletin No. 49, 1942.
6. Ash, P. “The Reliability of Psychiatric Diagnoses,” Journal of Abnormal
    and Social Psychology, 44:272-76 (1949).
7. Baldwin, A. L. “Personal Structure Analysis: A Statistical Method for
    Investigating the Single Personality,” Journal of Abnormal and Social
    Psychology, 37:163-83 (1942).
8. Barron, F. “Some Test Correlates of Response to Psychotherapy,” Journal of
    Consulting Psychology, 17:235-41 (1953).
9. Beck, S. J. Rorschach's Test. I. Basic Processes. New York: Grune and
    Stratton, 1944.
10. Bergmann, G. “Holism, Historicism, and Emergence,” Philosophy of Science,
    11:209-21 (1944).
11. Berne, E. “The Nature of Intuition,” Psychiatric Quarterly, 23:203-26
    (1949).
12. Blenkner, M. “Predictive Factors in the Initial Interview in Family
    Casework,” Social Service Review, 28:65-73 (1954).
13. Bloom, R. F., and E. G. Brundage. “Prediction of Success in Elementary
    Schools for Enlisted Personnel,” in D. B. Stuit, ed., Personnel Research
    and Test Development in the Bureau of Naval Personnel. Princeton, N.J.:
    Princeton University Press, 1947.
14. Bobbitt, J. M., and S. H. Newman. “Psychological Activities at the United
    States Coast Guard Academy,” Psychological Bulletin, 41:568-79 (1944).
15. Borden, H. G. “Factors for Predicting Parole Success,” Journal of American
    Institute of Criminal Law and Criminology, 19:328-36 (1928).
16. Bross, I. D. J. Design for Decision. New York: Macmillan, 1953.


17. Burgess, E. W. “Factors Determining Success or Failure on Parole,” in A. A.
    Bruce, ed., The Workings of the Indeterminate Sentence Law and the Parole
    System in Illinois. Springfield, Ill., 1928.
18. Burgess, E. W. “An Experiment in the Standardization of the Case-Study
    Method,” Sociometry, 4:329-48 (1941).
19. Burgess, E. W. “Rejoinder,” American Journal of Sociology, 48:81-86
    (1942).
20. Burgess, E. W., and L. S. Cottrell. Predicting Success or Failure in
    Marriage. New York: Prentice-Hall, 1939.
21. Burt, C. The Factors of the Mind. New York: Macmillan, 1941.
22. Carnap, R. “The Two Concepts of Probability,” Philosophy and
    Phenomenological Research, 5:513-32 (1945).
23. Carnap, R. “On Inductive Logic,” Philosophy of Science, 12:72-97 (1945).
24. Carnap, R. “Remarks on Induction and Truth,” Philosophy and
    Phenomenological Research, 6:590-602 (1946).
25. Carnap, R. “Probability as a Guide in Life,” Journal of Philosophy,
    44:141-48 (1947).
26. Carnap, R. “On the Application of Inductive Logic,” Philosophy and
    Phenomenological Research, 8:133-48 (1947).
27. Carnap, R. “Reply to Felix Kaufman,” Philosophy and Phenomenological
    Research, 9:300-4 (1948).
28. Cattell, R. B. Description and Measurement of Personality. Yonkers, N.Y.:
    World Book Company, 1946.
29. Cattell, R. B. “P-Technique Demonstrated in the Determination of
    Psycho-Physiological Source Traits in a Normal Individual,” Psychometrika,
    12:267-88 (1947).
30. Chapman, D. W. “The Statistics of the Method of Correct Matchings,”
    American Journal of Psychology, 46:287-98 (1934).
31. Chauncey, Henry. Personal communication.
32. Chein, I. “The Logic of Prediction: Some Observations on Dr. Sarbin’s
    Exposition,” Psychological Review, 52:175-79 (1945).
33. Conrad, H. S., and G. A. Satter. “Use of Test Scores and Quality
    Classification Ratings in Predicting Success in Electrician’s Mates
    School,” OSRD Report No. 5667, Sept. 3, 1945.
34. Cottrell, L. S. “The Case-Study Method in Prediction,” Sociometry,
    4:358-70 (1941).
35. Davis, F. B. Utilizing Human Talent. Washington, D.C.: American Council
    on Education, 1947.
36. Duncan, O. D., L. E. Ohlin, A. J. Reiss, and H. R. Stanton. “Formal
    Devices for Making Selection Decisions,” American Journal of Sociology,
    58:573-84 (1953).
37. Dunham, H. W., and B. N. Meltzer. “Predicting Length of Hospitalization
    of Mental Patients,” American Journal of Sociology, 52:123-31 (1946).
38. Dunlap, J. W., and M. J. Wantman. An Investigation of the Interview as a
    Technique for Selecting Aircraft Pilots. Civil Aeronautics Administration
    Report No. 33, Washington, D.C., 1944.
39. Elkin, F. “Specialists Interpret the Case of Harold Holzer,” Journal of
    Abnormal and Social Psychology, 42:99-111 (1947).
40. Estes, S. G. “Judging Personality from Expressive Behavior,” Journal of
    Abnormal and Social Psychology, 33:217-36 (1938).
41. Feigl, H. “Existential Hypotheses,” Philosophy of Science, 17:35-62
    (1950).
42. Felix, R. H., D. C. Cameron, J. M. Bobbitt, and S. H. Newman. “An
    Integrated Medico-Psychological Program at the United States Coast Guard
    Academy,” American Journal of Psychiatry, 101:635-42 (1945).

43. Fenichel, O. Problems of Psychoanalytic Technique. New York:
    Psychoanalytic Quarterly, 1941.
44. Foreman, P. B. “The Theory of Case Studies,” Social Forces, 26:408-19
    (1948).
45. Freud, S. Collected Papers. Vols. I-V. London: Hogarth Press, 1924, 1950.
46. Goodenough, F. L. “Expression of the Emotions in a Deaf-Blind Child,”
    Journal of Abnormal and Social Psychology, 27:328-33 (1932).
47. Guilford, J. P., ed. Printed Classification Tests. AAF Aviation Psychology
    Program Research Report No. 5. Washington, D.C.: U.S. Government
    Printing Office, 1947.
48. Hamlin, R. “Predictability of Institutional Adjustment of Reformatory
    Inmates,” Journal of Juvenile Research, 18:179-84 (1934).
49. Hanks, L. M. “Prediction of Case Material from Personality Tests,” Archives
    of Psychology, 1936, No. 207.
50. Harrison, R. “The TAT and Rorschach Methods of Personality Investigation
    in Clinical Practice,” Journal of Psychology, 15:49-74 (1943).
51. Hathaway, S. R. “A Coding System for MMPI Profiles,” Journal of
    Consulting Psychology, 11:534-37 (1947).
52. Hathaway, S. R., and P. E. Meehl. An Atlas for the Clinical Use of the
    MMPI. Minneapolis: University of Minnesota Press, 1952.
53. Hebb, D. O. “Emotion in Man and Animal: An Analysis of the Intuitive
    Processes of Recognition,” Psychological Review, 53:88-106 (1946).
54. Hempel, C. G. “The Function of General Laws in History,” in H. Feigl and
    W. Sellars, eds., Readings in Philosophical Analysis. New York:
    Appleton-Century-Crofts, 1949.
55. Hollingworth, H. L. Judging Human Character. New York: Appleton-Century,
    1923.
56. Horst, P. “Pattern Analysis and Configural Scoring,” Journal of Clinical
    Psychology, 10:3-11 (1954).
57. Horst, P. Prediction of Personal Adjustment. S.S.R.C. Bulletin No. 48, 1941.
58. Hovey, H. B. “MMPI Profiles and Personality Characteristics,” Journal of
    Consulting Psychology, 17:142-46 (1953).
59. Hovey, H. B., and J. C. Stauffacher. “Intuitive versus Objective Prediction
    from a Test,” Journal of Clinical Psychology, 9:349-51 (1953).
60. Hull, C. L. Principles of Behavior. New York: Appleton-Century, 1943.
61. Jenkins, R. L. “The Relationship between Scientific and Pre-Scientific
    [...] in Psychiatry and Mental Hygiene,” Mental Hygiene, 29:78-94 (1945).
62. Kelly, E. L., and D. W. Fiske. The Selection of Clinical Psychologists:
    Progress Report and Preliminary Findings. Ann Arbor, Mich.: Edwards Letter
    Shop, 1948.
63. Kelly, E. L., and D. W. Fiske. “The Prediction of Success in the V.A.
    Training Program in Clinical Psychology,” American Psychologist, 5:395-406
    (1950).
64. Kelly, E. L., and D. W. Fiske. The Prediction of Performance in Clinical
    Psychology. Ann Arbor, Mich.: University of Michigan Press, 1951.
65. Klopfer, B., and D. M. Kelley. The Rorschach Technique. Yonkers, N.Y.:
    World Book Company, 1946.
66. London, I. D. “Psychology and Heisenberg’s Principle of Indeterminacy,”
    Psychological Review, 52:162-68 (1945).

67. London, I. D. “Some Consequences for History and Psychology of Langmuir’s
    Concept of Convergence and Divergence of Phenomena,” Psychological
    Review, 53:170-88 (1946).
68. Luft, J. “Implicit Hypotheses and Clinical Predictions,” Journal of Abnormal
    and Social Psychology, 45:756-59 (1950).
69. Luft, J. “Differences in Prediction Based on Hearing versus Reading
    Verbatim Clinical Interviews,” Journal of Consulting Psychology, 15:115-19
    (1951).
70. Lundberg, G. A. “Case-Studies vs. Statistical Methods—An Issue Based on
    Misunderstanding,” Sociometry, 4:379-83 (1941).
71. MacCorquodale, K., and P. E. Meehl. “On a Distinction between
    Hypothetical Constructs and Intervening Variables,” Psychological Review,
    55:95-107 (1948).
72. Meehl, P. E. “Configural Scoring,” Journal of Consulting Psychology,
    14:165-71 (1950).
73. Melton, R. S. “A Comparison of Clinical and Actuarial Methods of
    Prediction with an Assessment of the Relative Accuracy of Different
    Clinicians.” Unpublished Ph.D. thesis, University of Minnesota, 1952.
74. Munroe, R. “An Experiment in Large Scale Testing by a Modification of the
    Rorschach Method,” Journal of Psychology, 13:229-63 (1942).
75. Murray, H. A. Explorations in Personality. New York: Oxford University
    Press, 1938.
76. Nagel, E. “Some Issues in the Logic of Historical Analysis,” Scientific
    Monthly, 74:162-69 (1952).
77. Newman, S. H., and J. M. Bobbitt. “The Development of [...] Tests
    for the United States Coast Guard Academy,” Journal of Applied
    Psychology, 32:248-54 (1948).
78. Newman, S. H., J. M. Bobbitt, and D. C. Cameron. “The Reliability of the
    Interview Method in an Officer Candidate Evaluation Program,” American
    Psychologist, 1:103-9 (1946).
79. Piotrowski, Z. “Differences between Cases Giving Valid and Invalid
    Personality Inventory Responses,” Annals of the New York Academy of
    Sciences, 46:633-38 (1946).
80. Polansky, N. “How Shall a Life-History Be Written?” Character and
    Personality, 9:188-207 (1941).
81. Psycholo[...]
82. Reichenbach, H. Experience and Prediction. Chicago: University of Chicago
    Press, 1938.
83. Reik, T. Listening with the Third Ear. New York: Farrar, Straus, 1948.
84. Sarason, S. “The TAT and Subjective Interpretation,” Journal of Consulting
    Psychology, 12:285-99 (1948).
85. Sarbin, T. R. “Clinical Psychology — Art or Science?” Psychometrika,
    6:391-400 (1941).
86. Sarbin, T. R. “A Contribution to the Study of Actuarial and Individual
    Methods of Prediction,” American Journal of Sociology, 48:593-602 (1942).
87. Sarbin, T. R. “The Logic of Prediction in Psychology,” Psychological
    Review, 51:210-28 (1944).
88. Sarbin, T. R., and R. Taft. An Essay on Inference in the Psychological
    Sciences. Berkeley, Calif.: Garden Library Press, 1952.
89. Sargent, H. “Projective Methods: Their Origins, Theory, and Application
    in Personality Research,” Psychological Bulletin, 42:257-93 (1945).

References 


90. Schiedt, R. Ein Beitrag zum Problem der Riickfallsprognose. Ph.D. thesis. 
Munich: Miinchner-Zeitungs-Verlag, 1936. 

91. Schneider, A. J. N., C. W. Lagrone, E. T. Glueck, and S. Glueck. “Predic- 
tion of Behavior of Civilian Delinquents in the Armed Forces,” Mental 
Hygiene, 28:456-75 (1944). 

92. Sorokin, P. A. “A Criticism of The Prediction of Personal Adjustment,” 
American Journal of Sociology, 48:76-80 (1942). 

93. Spence, K. W. “The Nature of Theory Construction in Contemporary Psy- 
chology,” Psychological Review, 51:47-68 (1944). 

94. Spence, K. W. “The Postulates and Methods of Behaviorism,” Psychological 


Review, 55:67-78 (1948). 
95. Stouffer, S. A. “Notes on the Case Study and the Unique Case,” Sociometry, 


4:349-57 (1941). 

96. Super, D. E. Appraising Vocational Fitness. New York: Harper, 1949. 

97. Taft, Ronald. “Some Correlates of the Ability to Make Accurate Social 
Judgments.” Unpublished Ph.D. thesis, University of California, 1950. 

98. Taylor, D. W. “An Analysis of Predictions of Delinquency Based on Case 
Studies,” Journal of Abnormal and Social Psychology, 43:45-56 (1947). 

99. Thorndike, E. L. “Fundamental Theorems in Judging Men,” Journal of 
Applied Psychology, 2: 67-76 (1918) . | 

100. Thurstone, L. L. Multiple Factor Analysis. Chicago: University of Chicago 
Press, 1947. 

101. Viteles, M. S. “The Clinical Viewpoint in Vocational Selection,” Journal of 
Applied Psychology, 9:131-38 (1925).

102. Vold, G. B. “Comment on Crucial Problems in Methods of Predicting Social 
Adjustment,” Sociometry, 4:374-78 (1941). 

103. Wittman, M. P. “A Scale for Measuring Prognosis in Schizophrenic Patients,”
Elgin Papers, 4:20-33 (1941).

104. Wittman, M. P., and L. Steinberg. “Follow-up of an Objective Evaluation
of Prognosis in Dementia Praecox and Manic-Depressive Psychoses,” Elgin
Papers, 5:216-27 (1944).




Index 


“Actuarial”: as synonym of “inductive,”
25, 37, 46, 58, 76, 78; Lundberg’s use
of term, 24, 37, 78

Actuarial method, definition, 3 

Actuarial table, failure to contain items, 
40, 52-54, 110-11 

Additive assumption, 131-32 

Aircrew training, prediction of success 
in, 96-98, 100-2 

Alexander, F., 22, 30 

Allport, F. H., 88, 89 

Allport, G. W., 4: on prediction from
class membership, 19-23; on studying
clinicians’ success, 33; uniqueness thesis,
40, 60, 129; on efficiency of predictive
methods, 83; structural analysis, 85


Analysis of variance, interaction terms
in, 132

Analytic use of statistics, 11 

Animal behavior, easy to classify re- 
sponses in, 42 

AAF research, 97-98 

Art, meanings of in clinical work, 74-82 

Artistic task of clinician, 74 

Assumptions, psychological, made in use
of statistics, 11, 13-14

Atomistic function, definition of, 134 


Barron, F., 106-7, 119 
Bayes’ Theorem, 62-64 
Behavior, causes underlying, 12-14, 45
Bergmann, G., 45
Blenkner, M., 107-8
Bloom, R. F., 105
Bobbitt, J. M., 102-3


Borden, H. G., Te 
Box analogy, 57-59

Brain, human: as an instrument, 25-28;
as a weight-assigner, 39

Bross, I. D. J., 117

Brundage, E. G., 105

Burgess, E. W., 95-96

Burt, C., 13


Carnap, R., 35-36
Carnegie, Dale, 70
Case study: definition of method, 18;
two methods of combining, [illegible];
modes of writing history, [illegible]
Casework, family, prediction of outcomes
in, 107-8
Cattell, R. B., 129
Causation, psychological, 6, 19, 45
Causes of behavior: [illegible];
and statistical weights, [illegible]; hypotheses
about, see Hypotheses, structural-dynamic
Centralists, 10
“Chances” of an individual, as actuarial
notion, 20, 30
Chauncey, H., 112-13 


Chein, I., 29, 32
Clairvoyant predictions, confirmability of,
31-32, 77-78

Class: membership, in inference, [illegible];
optimal, for calculating [illegible];
implicit reference to in individual
prediction, 29-30
Clerical worker: as predictor, [illegible];
trained into a clinician, [illegible]
Clinical art, meanings of, 74-82


Clinical method, definition of, 3-4

Clinical predictions, empirical compari- 
son with statistical, see Empirical com- 
parisons 

Clinical psychology training, predicting 
success in, 98-100 

Clinical skills, generality of, 116-17 

Clinical utility, claimed as substitute for 
validation, 136-38 

Clinical validation, 7, 136-38 

Clinicians: concern with individuals, 25,
135; as instrument, 26-27, 31, 38, 39;
statements of as class for basing
probabilities, 34; motivation of, 73-74;
variation in personal gifts, 79-82; empirical
findings on variation, 88, 92, 94, 106,
108; in a prediction equation, 91, 102;
importance of determining variation
among, 114-15; difficulty of locating
superior, 115-17; difficulty of utilizing
variation among, 115-17; confidence
of, 117-18, 138

Coast Guard training, prediction of suc- 
cess in, 102-3 

College grades, predictions of, 90-92,
105-6, 112-13

Combination, method of, contrasted with 
data, 15 

Confidence of clinician, 117-18, 138 

Configurated system, 134 

Confirmation, degree of, 36 

Conrad, H. S., 95 

Constructs, to explain behavior, 12. See 
also Hypotheses, structural-dynamic 

Contexts, Reichenbach’s two: of discovery,
26, 66, 78; of justification, 26, 66,


Correlation, partial, 13-14 

Cost of clinical versus statistical method,
127-28

Covariance, analysis of, 14 

Criterion variables, socially defined, 122-24

Cues, subtle: human brain and, 27; not 
subliminal, 70 

Cumulative causation, principle of, 43 


Data versus method of combination, 15-
18: all possibilities realized, 18; in
Polansky study, 87-88

Davis, F. B., 103

Decision policy, 7, 117


Degree of confirmation, 36 

Dependent-variable side, inference from, 
65 

Description, always incomplete, 130 

Discovery, context of, 26, 66, 78 

Discriminative use of statistics, 11-12: 
unavoidable, 136-38 

Dispositions, second-order, 61-62 

Dreams, puns in, 71-72 

Drive-variable, specified by reinforcing 
class, 54, 60-61 

Drives: classification of, 54, 60-61; 
uniqueness of, 60-61 

Duncan, O. D., 17 

Dunham, H. W., 96 

Dunlap, J. W., 100-2 

Dynamic hypotheses, see Hypotheses, 
structural-dynamic 

Dynamic lawfulness, basis of response 
classification, 41-42 


Economic factor in predictive efficiency, 
7, 126-28 

Electrical training, prediction of success 
in, 95 

Empirical comparisons of actuarial and
clinical prediction methods: Allport's
examples not such, 84; necessary conditions
for, 84; college grades, 90, 105-6,
112-13; schizophrenic prognosis, 92-94,
96; criminal recidivism, 94-95; electrical
training, 95; parole violation, 95-96,
103-4; aircrew training, 96-98,
100-2; clinical psychology training, 98-
100; Coast Guard Academy training,
102-3; institutional adjustment, 104-5;
naval specialty schools, 105; psychotherapy,
response to, 106-7; family
casework outcome, 107-8; nursing
supervisor ratings, 108-12

“Empty organism” attitude in predic- 
tion, 124 

Errors, human, 28 

Estes, S. G., 27, 83, 88-89 

Explicit reasons, not necessarily actu- 
arial, 16 

Expressive movement, inferences from, 
70, 88 

Evidence, reasoned, not necessarily ac- 
tuarial, 16 


Factor-analysis, 12-14: as structural-
analytic prototype, 12, 14; and causes
of behavior, 13; in improving validity,
13; psychological assumptions, 14;
P-technique, 81

Facts, perceptual, 27 

Family casework, prediction of outcomes 
in, 107-8 

Feigl, H., 47 

Fenichel, O., 66 

Fiske, D. W., 98-100, 127

Frederiksen, N., 83, 89 

Frequency, statistical, ordinary words 
refer to, 20, 30, 137 

Frequency theory of probability, 34-36 

Frequentists, and probability of hypotheses,
36

Freud, S., 65 


Geisteswissenschaft, 76 
Generality of clinical skills, 116-17 
Genotypic classification, 41-42
Gestalt psychologists, 27
Glueck, S., prediction tables, 113-14
Goodenough, F., 97
Grant, D., 62-64
Guthrie, E. R., learning theory of, 44
Hadley, H. D., 96-97 
Hamlin, R., 104-5 
Hathaway, S. R., 109 
Heisenberg Principle, 45 
Hollerith machine, 6, 34, 38, 74, 76
Hollingworth, H. L., 28 
Holmes, Sherlock, rational but non-
actuarial, 16
Horst, P., 133
Hovey, H. B., 108-12
Hullian laws, as derivative, 44-45
Human observation, errors, 27-28
Hypotheses
General and particular, 65
Invention of: a creative act, [illegible],
73; by skilled detective, 64; by engineer,
66-67; no recipe for, 49-50, 71,
73, 79; during therapy, 120-21
Probability of: Lundberg-Sarbin
thesis, 29, 34-36; Reichenbach on, 35;
Carnap’s views, 35-36; in inductive
logic, 35-36, 135
Structural-dynamic: in prediction,
3-4, 56-57; in use of statistics, 12-14;
seems nonactuarial, 31; clinician versus
clerk, 46-50; examples of, [illegible],
71-72; exemplify but do not follow
from general laws, 49-50; box analogy,
57-59; initial probabilities of, [illegible];
supplements experience tables, [illegible];
formation of, see Hypotheses, invention of

Identity conception of probability, 34-35
Impressionistic combination of data, 15
Individuals: need not be elements of
actuarial table, 16, 20-22; [illegible]
from correlation, 23; major concern of
clinician, 25; statistics relevant to, 135
Inductive logic: Carnap’s [illegible], 35-36;
in primitive condition, 36, 135
Inference: rational not necessarily
statistical, 16; interpretive, 40
Informal methods of combining data, 16
Initial conditions, inaccessibility of,
56
Inner events: hypothetical, [illegible]; box
analogy, 57-59. See also Hypotheses,
structural-dynamic
Institutional adjustment, prediction of,
104-5
Interaction of variables, 110, [illegible]
Intuition, not synonymous with “non-
actuarial,” 16
Inventories, traditional, [illegible]
Inverse probability: and structural-dynamic
hypotheses, 62-64; in [illegible]
attribution, 111-12


Judges, differences among, 81
Justification, context of, 26, 66, 


Kelly, E. L., 6, 98-100, 127 


Law: Hullian, as derivative, 44-45; [illegible],
47; need not relate [illegible];
end-terms of, 54-55, 64; [illegible]
parameters, 64
Lawfulness: in relation to uniqueness,
40, 64-65, 78-79; basis of response
classification, 41-42
Leniency error, 92, 106
Lepley, W. M., 96-97
Levels of data, human judgment at,
18


Lewin, K., 41 

Life experiences, partially inaccessible, 
55-56 

Logicians, on inductive logic, 36, 135 

London, I. D., 45, 61 

Lundberg, G. A., 4, 29: use of term
“actuarial,” 24, 37; on clinician's weighting
of factors, 25; theoretical position
of, 31; on actuarial study of single
person, 38


Manic-depressive psychosis, prognosis in, 
94, 96 

Matchings, method of, 12 

Meaning criterion, verifiability, 29-33, 
34-36, 75 

Mechanical combination defined, 15-16 

Mechanical example of prediction prob- 
lem, 57-59 

Melton, R. S., 105-6 

Meltzer, B. N., 96 

MMPI: predictions from by two methods,
106-7, 108-12; rarity of certain
[illegible], 110-11; an economical device,
[illegible]

MMPI Atlas, 110-11

Motivation: classification of, 54, 60-61;
of some clinicians, 73-74

Movement, casework, prediction of, 107-8


Multiple sclerosis, 136-37 
Murray, H. A., 4, 10, 26 


Navy specialist training, predicting success
in, 105
Need-variables, unique, 54, 60-61
Newman, S. H., 102-3
Nondeductive, not irrational, 78
Nonlinearity, 130-31: not synonymous
with “patterning,” 133
Nonpsychometric data, two methods of
combining, 18
Novelty: level of laws and, 45; rational
understanding, 78-79
Nursing supervisor ratings, prediction of,
108-12


Observation, errors, 27-28 

Ohlin, L. E., 17 

Orderliness in behavior, see Law,
Lawfulness


P-technique, 81 

Parameters: and uniqueness, 41; 
at birth unknown, 55 

Parole violation, prediction of, 95-96,
103-4

Partial-correlation, causal inferences 
from, 14 

Patterning: meaning of, 132-35; special 
case of nonlinearity, 133 

Peripheralists, 10 

Person, forming conception of, 4, 46, 
123-24 

Personality as a structure, 61-62 

Phenotypic classification, 41 

Physicalism, as epistemological thesis, 44 

Physicalistic specification of response
classes, 42-44

Polansky, N., 83-88 

Positivism, 5, 29-30, 34 

Prediction: verifiability of clairvoyant,
31-32, 77-78; statements as a class,
33-34; specificity of problem, 116;
during therapy, 120-21; task unlike
therapeutic task, 120-26; of specific
events versus outcomes, 122-26; form
in which task presented, 125-26; economic
factors in efficiency of, 126-28;
empirical comparison of two methods,
see Empirical comparisons

Predictive statements, as a class for
which frequencies are calculable, 33-34

Press, unpredictability of, 122-24

Probabilities, hierarchies of equally cor- 
rect, 34 

Probability: true, 34; two concepts of, 
34-36; theory mechanically applicable, 
56-57 

Professional relationships, affected by 
predictive philosophy, 7 

Projective methods, 4, 7, 128 

Psychological assumptions in statistics, 
11, 13-14 

Psychological causation not actuarial, 6, 
19, 45 

Psychological hypotheses about inner 
events, see Hypotheses, structural- 
dynamic 

Psychologist, unique tools of, 7 

Psychometric data: defined, 15; com- 
bined mechanically, 18; abstractness 
of, 87, 129-30 




Psychometric description of a person: 
judges’ reaction to, 85; abstractness 
of, 87, 129-30 

Psychotherapy: predicting response to, 
7, 10-11, 106-7, 119, 124-25; predic- 
tions during, 48-51, 120-28, 125-26 

Puns in dreams, 71-72 


Qualitative concepts, precede quantifi- 
cation, 41 


R-R laws, 47 

Rapaport, D., 6, 38 

Rarity of crucial factors, 25 

Rational argument, need not be actu- 
arial, 16 

Recidivism, prediction of, 94-95 

Reichenbach, H., 22, 32, 36

Reik, T., 50, 65, 72 

Reiss, A. J., 117

Research attitudes, 8 

Response class: specification of, 41; precedes
quantification, 41; and reinforcement
conditions, 42; cultural generation
of, 44

Rotation problem in factor analysis, 13


Sampling errors: capitalization on, 110, 
115; in patterned prediction functions, 
135 

Sarason, S., 78 

Sarbin, T. R., 4, 29: on a person’s
“chances,” 20; view of all prediction
as actuarial, 22; on clinician's weighting
of factors, 25, 38-39; use of term
“actuarial,” 46; on art in clinical psychology,
78-82; empirical comparison
of predictions, 90-92; generalization
made by, 119


Satter, G. A., 95 
Schiedt, R., 94-95
Schizophrenia, prognosis, [illegible]
Schneider, A. J. N., [illegible]
Scientific training, and impressionistic
methods, 88-89
Scores, identical, differ in meaning, 129-30
Sensitization to hypotheses, 52
Shock treatment: prognosis, 6, 12; empirical
studies, 92-94
Shrinkage, cross-validation, severe in configurated
prediction functions, 110, 135


Single case: as source of actuarial data, 
20-22; verifiability of predictions, 29- 
33; lawfulness detectable, 78-79 

Skinner, B. F., 41-43

Social behavior, difficulty of specifying
response class, 42-44, 54-55

Socially defined criterion variables, 122-24

Sociological clusters, among psychologists,
10

Spence, K., 47

Stanton, H. R., 117

Statistical predictions, empirical comparison
with clinical, see Empirical
comparisons

Statistics, two uses of, 11-15, 136 

Stauffacher, J. C., 108-12 

Steinberg, L., 94 

Stouffer, S. A., 24, 130 

Structural analysis, Allport's, 84-85 

Structural-analytic use of statistics, 
12-15 

Structural hypotheses, see Hypotheses,
structural-dynamic

Structure, personality as, 61-62 

Super, D. E., 96-98 


11, 


Table, actuarial: elements need not be
persons, 16, 20-22, 38

Test, definition of, 15 

Testimony, psychology of, 28 

Therapists: shortage of, 7; confidence
of, 121

Therapy: timing in, 81; predictions and
postdictions during, 120-21; task unlike
prognosis, 120-26; and [illegible]
knowledge, 127-28

Thorndike, E. L., 131

Thurstone, L. L., 130

Time: shortage of therapeutic, [illegible]; series
on individual, 21, 81; factor in efficiency
of methods, 126-28

Timing in therapy, 81

Topography, response, 41, 42, 44


Understanding, subjective feeling of, [illegible]
Unique event: inferences about from
class, [illegible]; and lawfulness, 40, 65,
78-79; not peculiar to human beings,
129-30; difficulty of classifying,
50; from few traits, 130

Units, equality of, 130-31


Usefulness, as claim of validity,
136-38


Validating use of statistics, 11-15, 136
Validation, clinical, 7, 136-38
Variance of criterion, underestimation of,
92
Verbal response, clinician’s: member of
a statistical class, 33-34; appropriate
but not descriptive of cue basis, 69-70
Verbalizing cues, 69-70
Verifiability criterion, 29-33, 34-36, 75
Verstehen, 74


Wantman, M. J., 100-2
Weights, 38: alteration by clinician, 24-
25, 38-39; inefficient, 92, 109-10, 114,
118, 121; unstable in configurated
prediction functions, 110-11, 135
Wittman, P., 17, 92-94




from front flap 


diction methods as to their actual
efficiency in practice. He analyzes
traditional statements of the methodological
questions involved and exposes
current confusions and clichés on both
sides.

The discussion avoids a partisan
viewpoint, presenting, instead, a balanced,
unbiased synthesis of the different
viewpoints. Since this is the first
time the factual material has been reviewed
and analyzed impartially, the
work will serve as a useful guidepost
in helping to solve a dilemma that
faces many psychologists, statisticians,
sociologists, and psychiatrists.

Dr. Meehl is chairman of the department
of psychology [illegible]


UNIVERSITY OF MINNESOTA PRESS, Minneapolis


An Atlas for the Clinical Use
of the MMPI


by STARKE R. HATHAWAY and PAUL E. MEEHL


A reference book for users of the Minnesota Multiphasic Personality
Inventory, this volume presents 968 short case histories
with one or more associated MMPI profiles for each.
Psychiatric Quarterly points out that “this volume will better
enable the clinician to use much of the meaningful material
that the inventory often obtains.” $9.75


Analyzing and Predicting
Juvenile Delinquency with the MMPI


edited by STARKE R. HATHAWAY and
ELIO D. MONACHESI


A report on a series of pioneer studies investigating the possibilities
of using the Minnesota Multiphasic Personality Inventory
as an objective instrument to help identify youngsters
who are likely or unlikely to become delinquent. The Journal
of Criminal Law, Criminology, and Police Science says Drs.
Hathaway and Monachesi “have done an outstanding job in
assembling and editing an analysis leading to some prediction
in delinquency trends.” $3.50


UNIVERSITY OF MINNESOTA PRESS, Minneapolis


